Dataset statistics
| Number of variables | 73 |
|---|---|
| Number of observations | 488522 |
| Missing cells | 1972026 |
| Missing cells (%) | 5.5% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 272.1 MiB |
| Average record size in memory | 584.0 B |
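The percentage and per-record figures in this table follow from the raw counts. A minimal sketch of the arithmetic, assuming the memory figure uses binary mebibytes (1 MiB = 1024 × 1024 bytes); the variable names are illustrative and not part of the report:

```python
# Counts copied from the dataset statistics table.
n_variables = 73
n_observations = 488_522
missing_cells = 1_972_026
total_size_mib = 272.1

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```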
Variable types
| Numeric | 6 |
|---|---|
| Text | 28 |
| Categorical | 31 |
| Unsupported | 7 |
| DateTime | 1 |
toxval_units_converted has constant value "" | Constant |
toxval_units_standard has constant value "" | Constant |
toxval_units_human has constant value "" | Constant |
toxval_uuid has constant value "" | Constant |
toxval_hash has constant value "" | Constant |
visible has constant value "" | Constant |
toxval_id is highly overall correlated with source and 6 other fields | High correlation |
toxval_numeric is highly overall correlated with toxval_numeric_original | High correlation |
toxval_numeric_original is highly overall correlated with toxval_numeric | High correlation |
species_id is highly overall correlated with human_ra | High correlation |
source is highly overall correlated with toxval_id and 9 other fields | High correlation |
source_url is highly overall correlated with toxval_id and 6 other fields | High correlation |
subsource_url is highly overall correlated with source and 2 other fields | High correlation |
details_text is highly overall correlated with toxval_id and 9 other fields | High correlation |
priority_id is highly overall correlated with toxval_id and 6 other fields | High correlation |
risk_assessment_class is highly overall correlated with toxval_id and 4 other fields | High correlation |
human_eco is highly overall correlated with source and 5 other fields | High correlation |
toxval_numeric_qualifier is highly overall correlated with toxval_numeric_qualifier_original | High correlation |
toxval_numeric_qualifier_original is highly overall correlated with toxval_numeric_qualifier | High correlation |
study_type is highly overall correlated with risk_assessment_class and 1 other field | High correlation |
study_duration_class is highly overall correlated with subsource_url and 2 other fields | High correlation |
strain_group is highly overall correlated with human_eco and 1 other field | High correlation |
habitat is highly overall correlated with source and 5 other fields | High correlation |
sex is highly overall correlated with source and 2 other fields | High correlation |
exposure_route is highly overall correlated with target_species | High correlation |
exposure_form is highly overall correlated with habitat and 2 other fields | High correlation |
exposure_form_original is highly overall correlated with habitat and 1 other field | High correlation |
lifestage is highly overall correlated with lifestage_original and 2 other fields | High correlation |
lifestage_original is highly overall correlated with habitat and 4 other fields | High correlation |
generation is highly overall correlated with lifestage and 2 other fields | High correlation |
generation_original is highly overall correlated with lifestage and 2 other fields | High correlation |
target_species is highly overall correlated with toxval_id and 7 other fields | High correlation |
human_ra is highly overall correlated with toxval_id and 7 other fields | High correlation |
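The report does not state which association measure or threshold triggered each high-correlation alert. For pairs of categorical fields, one common choice is Cramér's V. The sketch below assumes that measure, without bias correction, and uses hypothetical contingency tables rather than data from this dataset:

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Hypothetical counts: rows are values of one field, columns of another.
perfectly_linked = [[50, 0], [0, 50]]   # one field determines the other
independent = [[25, 25], [25, 25]]      # no association at all

v_linked = cramers_v(perfectly_linked)
v_independent = cramers_v(independent)
print(v_linked, v_independent)
```

Values near 1 indicate that one field nearly determines the other, which is expected for pairs such as a field and its `_original` counterpart.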
subsource_url is highly imbalanced (99.6%) | Imbalance |
qc_status is highly imbalanced (80.7%) | Imbalance |
human_eco is highly imbalanced (58.3%) | Imbalance |
toxval_numeric_qualifier is highly imbalanced (64.6%) | Imbalance |
toxval_numeric_qualifier_original is highly imbalanced (52.1%) | Imbalance |
study_duration_class is highly imbalanced (86.5%) | Imbalance |
study_duration_units is highly imbalanced (64.8%) | Imbalance |
strain_group is highly imbalanced (59.5%) | Imbalance |
habitat is highly imbalanced (99.7%) | Imbalance |
exposure_route is highly imbalanced (63.5%) | Imbalance |
exposure_form is highly imbalanced (99.9%) | Imbalance |
exposure_form_original is highly imbalanced (99.9%) | Imbalance |
lifestage is highly imbalanced (81.7%) | Imbalance |
lifestage_original is highly imbalanced (82.9%) | Imbalance |
generation is highly imbalanced (77.9%) | Imbalance |
generation_original is highly imbalanced (77.9%) | Imbalance |
human_ra is highly imbalanced (65.1%) | Imbalance |
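The imbalance percentages can be read as one minus the normalized Shannon entropy of the category frequencies. That definition is an assumption here, since the report does not define the score, and the counts below are hypothetical:

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the class counts."""
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Hypothetical category counts.
balanced = imbalance_score([100, 100, 100, 100])
dominated = imbalance_score([9970, 10, 10, 10])
print(f"balanced: {balanced:.3f}, dominated: {dominated:.3f}")
```

Under this reading, a score close to 100% (as for exposure_form or habitat) means nearly all rows share a single category.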
toxval_numeric_converted has 488522 (100.0%) missing values | Missing |
toxval_numeric_standard has 488522 (100.0%) missing values | Missing |
toxval_numeric_human has 488522 (100.0%) missing values | Missing |
toxval_numeric_qualifier has 13610 (2.8%) missing values | Missing |
source_source_id has 488522 (100.0%) missing values | Missing |
toxval_numeric is highly skewed (γ1 = 218.4009116) | Skewed |
toxval_numeric_original is highly skewed (γ1 = 480.0306699) | Skewed |
mw is highly skewed (γ1 = 371.0191694) | Skewed |
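γ1 denotes the skewness coefficient. A minimal sketch of the moment-based (population) estimator is shown below; the profiler may apply a sample-size bias correction, so this is an illustration of the quantity rather than a reproduction of the reported values, and the data are hypothetical:

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (undefined for constant data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical data: a symmetric sample and one dominated by a single huge value.
symmetric = [1, 2, 3, 4, 5]
heavy_tail = [1.0] * 999 + [1_000_000.0]

print(skewness(symmetric))
print(skewness(heavy_tail))
```

Skewness values in the hundreds, as reported for toxval_numeric and mw, are consistent with a handful of extreme values among many small ones.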
toxval_id has unique values | Unique |
toxval_numeric_converted is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
toxval_numeric_standard is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
toxval_numeric_human is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
study_duration_value_original is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
year is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
year_original is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
source_source_id is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Reproduction
| Analysis started | 2023-09-26 16:03:59.639466 |
|---|---|
| Analysis finished | 2023-09-26 16:07:30.502545 |
| Duration | 3 minutes and 30.86 seconds |
| Software version | ydata-profiling v4.5.1 |
| Download configuration | config.json |
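The reported duration can be checked against the two timestamps. A small sketch using only the values in the table (the format string is an assumption about how the timestamps are laid out):

```python
from datetime import datetime

# Timestamp layout assumed from the values shown in the Reproduction table.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-09-26 16:03:59.639466", FMT)
finished = datetime.strptime("2023-09-26 16:07:30.502545", FMT)

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```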
toxval_id
Real number (ℝ)
HIGH CORRELATION  UNIQUE 
| Distinct | 488522 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1754911.8 |
| Minimum | 1172305 |
| Maximum | 4460051 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.7 MiB |
Quantile statistics
| Minimum | 1172305 |
|---|---|
| 5-th percentile | 1196731.1 |
| Q1 | 1294435.2 |
| median | 1769474.5 |
| Q3 | 1904407.8 |
| 95-th percentile | 2002111.9 |
| Maximum | 4460051 |
| Range | 3287746 |
| Interquartile range (IQR) | 609972.5 |
Descriptive statistics
| Standard deviation | 636361.91 |
|---|---|
| Coefficient of variation (CV) | 0.36261759 |
| Kurtosis | 10.787304 |
| Mean | 1754911.8 |
| Median Absolute Deviation (MAD) | 180466 |
| Skewness | 3.060569 |
| Sum | 8.5731304 × 10¹¹ |
| Variance | 4.0495648 × 10¹¹ |
| Monotonicity | Strictly increasing |
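Several of the toxval_id statistics are derived from one another, which gives a quick consistency check. The sketch below recomputes them from the rounded values printed in the tables, so small differences in the last digits are expected:

```python
import math

# Rounded values copied from the toxval_id tables.
n = 488_522
mean = 1754911.8
std = 636361.91
minimum, maximum = 1172305, 4460051
q1, q3 = 1294435.2, 1904407.8

value_range = maximum - minimum   # reported: 3287746
iqr = q3 - q1                     # reported: 609972.5
cv = std / mean                   # reported: 0.36261759
variance = std ** 2               # reported: 4.0495648e11
total = mean * n                  # reported: 8.5731304e11

print(value_range, round(iqr, 1), round(cv, 6))
print(f"{variance:.7e}", f"{total:.7e}")
```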
| Value | Count | Frequency (%) |
| 1172305 | 1 | < 0.1% |
| 1850891 | 1 | < 0.1% |
| 1850903 | 1 | < 0.1% |
| 1850902 | 1 | < 0.1% |
| 1850901 | 1 | < 0.1% |
| 1850900 | 1 | < 0.1% |
| 1850899 | 1 | < 0.1% |
| 1850898 | 1 | < 0.1% |
| 1850897 | 1 | < 0.1% |
| 1850896 | 1 | < 0.1% |
| Other values (488512) | 488512 |
| Value | Count | Frequency (%) |
| 1172305 | 1 | |
| 1172306 | 1 | |
| 1172307 | 1 | |
| 1172308 | 1 | |
| 1172309 | 1 | |
| 1172310 | 1 | |
| 1172311 | 1 | |
| 1172312 | 1 | |
| 1172313 | 1 | |
| 1172314 | 1 |
| Value | Count | Frequency (%) |
| 4460051 | 1 | |
| 4460050 | 1 | |
| 4460049 | 1 | |
| 4460048 | 1 | |
| 4460047 | 1 | |
| 4460046 | 1 | |
| 4460045 | 1 | |
| 4460044 | 1 | |
| 4460043 | 1 | |
| 4460042 | 1 |
source_hash
Text
| Distinct | 483614 |
|---|---|
| Distinct (%) | 99.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 38 |
|---|---|
| Median length | 32 |
| Mean length | 31.980723 |
| Min length | 1 |
Characters and Unicode
| Total characters | 15623287 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 483613 |
|---|---|
| Unique (%) | 99.0% |
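"Distinct" counts the different values in a column, while "Unique" counts the values that occur exactly once. For source_hash, 483614 distinct and 483613 unique values imply a single repeated value covering 488522 − 483613 = 4909 rows, which matches the 4909 dash characters in the character tables for this variable and suggests a "-" placeholder. A minimal sketch of the distinction on hypothetical values:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Hypothetical column: three real hashes and a repeated "-" placeholder.
sample = ["a1", "b2", "-", "-", "-", "c3"]
distinct, unique = distinct_and_unique(sample)
print(distinct, unique)

# Rows covered by repeated values in source_hash, from the reported figures.
repeated_rows = 488_522 - 483_613
print(repeated_rows)
```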
Sample
| 1st row | 0b0e4e6e5e435d48b4be88e3e9ecd6e4 |
|---|---|
| 2nd row | 22cf87387c639816e5e1006735799f31 |
| 3rd row | f30fe8d16153bc99dc926223e225a889 |
| 4th row | 900d2a78660511f77974e15f4d1c2468 |
| 5th row | 16ac7f18834d0aee5acdabce1ee15686 |
| Value | Count | Frequency (%) |
| (empty) | 4909 | 1.0% |
| 37ba345c6099acf44bb97613981d0ca1 | 1 | < 0.1% |
| 16ac7f18834d0aee5acdabce1ee15686 | 1 | < 0.1% |
| 677e3a6c9a6e84d0ffe282e7d21758ce | 1 | < 0.1% |
| b26cb9881b4538bee770b41190046635 | 1 | < 0.1% |
| c048e99b0b78841880210a892ac8611c | 1 | < 0.1% |
| c2dcbc2691830db96be5db500a64848e | 1 | < 0.1% |
| 54427e2f21d43579566d442faf2e97a1 | 1 | < 0.1% |
| e28d68d00b37aa74ae928291eee46b8b | 1 | < 0.1% |
| e8ee4b320718a12965a2755bb1f14f57 | 1 | < 0.1% |
| Other values (483604) | 483604 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 988588 | 6.3% |
| 3 | 980098 | 6.3% |
| 2 | 978565 | 6.3% |
| 8 | 978399 | 6.3% |
| 0 | 978324 | 6.3% |
| 4 | 977976 | 6.3% |
| 9 | 977278 | 6.3% |
| 6 | 977164 | 6.3% |
| 5 | 976659 | 6.3% |
| 7 | 975915 | 6.2% |
| Other values (8) | 5834321 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 9788966 | |
| Lowercase Letter | 5799483 | |
| Connector Punctuation | 29929 | 0.2% |
| Dash Punctuation | 4909 | < 0.1% |
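The category names in this table correspond to Unicode general categories. A short sketch, using Python's standard `unicodedata` module on the five sample values shown earlier, illustrates how such counts can be obtained; the mapping dictionary is illustrative and covers only the categories seen for this variable:

```python
import unicodedata
from collections import Counter

# Human-readable names for the two-letter Unicode general category codes.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Ll": "Lowercase Letter",
    "Pc": "Connector Punctuation",
    "Pd": "Dash Punctuation",
}

def category_counts(strings):
    """Count characters per Unicode general category across all strings."""
    return Counter(unicodedata.category(ch) for s in strings for ch in s)

samples = [
    "0b0e4e6e5e435d48b4be88e3e9ecd6e4",
    "22cf87387c639816e5e1006735799f31",
    "f30fe8d16153bc99dc926223e225a889",
    "900d2a78660511f77974e15f4d1c2468",
    "16ac7f18834d0aee5acdabce1ee15686",
]
counts = category_counts(samples)
for code, count in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```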
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 988588 | |
| 3 | 980098 | |
| 2 | 978565 | |
| 8 | 978399 | |
| 0 | 978324 | |
| 4 | 977976 | |
| 9 | 977278 | |
| 6 | 977164 | |
| 5 | 976659 | |
| 7 | 975915 |
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 967634 | |
| c | 967346 | |
| a | 967326 | |
| f | 966561 | |
| d | 965905 | |
| b | 964711 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 29929 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 4909 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 9823804 | |
| Latin | 5799483 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 988588 | |
| 3 | 980098 | |
| 2 | 978565 | |
| 8 | 978399 | |
| 0 | 978324 | |
| 4 | 977976 | |
| 9 | 977278 | |
| 6 | 977164 | |
| 5 | 976659 | |
| 7 | 975915 | |
| Other values (2) | 34838 | 0.4% |
Latin
| Value | Count | Frequency (%) |
| e | 967634 | |
| c | 967346 | |
| a | 967326 | |
| f | 966561 | |
| d | 965905 | |
| b | 964711 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 15623287 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 988588 | 6.3% |
| 3 | 980098 | 6.3% |
| 2 | 978565 | 6.3% |
| 8 | 978399 | 6.3% |
| 0 | 978324 | 6.3% |
| 4 | 977976 | 6.3% |
| 9 | 977278 | 6.3% |
| 6 | 977164 | 6.3% |
| 5 | 976659 | 6.3% |
| 7 | 975915 | 6.2% |
| Other values (8) | 5834321 |
source_table
Text
| Distinct | 53 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 45 |
|---|---|
| Median length | 40 |
| Mean length | 22.322567 |
| Min length | 1 |
Characters and Unicode
| Total characters | 10905065 |
|---|---|
| Distinct characters | 33 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | source_iuclid_iuclid_repeateddosetoxicityoral |
|---|---|
| 2nd row | source_iuclid_iuclid_repeateddosetoxicityoral |
| 3rd row | source_iuclid_iuclid_repeateddosetoxicityoral |
| 4th row | source_iuclid_iuclid_repeateddosetoxicityoral |
| 5th row | source_iuclid_iuclid_repeateddosetoxicityoral |
| Value | Count | Frequency (%) |
| direct | 106652 | |
| load | 106652 | |
| source_envirotox | 79988 | |
| source_iuclid_iuclid_acutetoxicityoral | 46251 | 7.8% |
| source_iuclid_iuclid_repeateddosetoxicityoral | 33702 | 5.7% |
| (empty) | 25194 | 4.2% |
| source_iuclid_iuclid_developmentaltoxicityter | 22549 | 3.8% |
| source_iuclid_iuclid_acutetoxicitydermal | 19235 | 3.2% |
| source_hpvis | 18075 | 3.0% |
| source_iuclid_iuclid_acutetoxicityinhalation | 17480 | 2.9% |
| Other values (44) | 119396 |
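This table appears to count lower-cased, whitespace-separated tokens rather than whole values: "direct" and "load" each occur 106652 times, the same as the number of space characters reported for this variable, which is consistent with a single value "direct load". The sketch below assumes that tokenization, with punctuation stripped from token ends; it is a guess at the profiler's behavior, and the input values are hypothetical:

```python
import string
from collections import Counter

def word_counts(values):
    """Count lower-cased whitespace tokens, stripping punctuation at both ends."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            counts[token.strip(string.punctuation)] += 1
    return counts

# Hypothetical column values.
values = ["direct load", "direct load", "source_envirotox", "-"]
counts = word_counts(values)
print(counts)
```

Under this assumption, a value consisting only of punctuation, such as "-", yields an empty token, which may explain the row with a blank value in the table.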
Most occurring characters
| Value | Count | Frequency (%) |
| i | 1275230 | |
| c | 1093900 | |
| e | 1018110 | |
| o | 1014248 | |
| u | 778469 | 7.1% |
| r | 755868 | 6.9% |
| t | 744279 | 6.8% |
| _ | 734246 | 6.7% |
| d | 712375 | 6.5% |
| l | 634304 | 5.8% |
| Other values (23) | 2144036 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 10028017 | |
| Connector Punctuation | 734246 | 6.7% |
| Space Separator | 106652 | 1.0% |
| Dash Punctuation | 25194 | 0.2% |
| Decimal Number | 10956 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 1275230 | |
| c | 1093900 | |
| e | 1018110 | |
| o | 1014248 | |
| u | 778469 | |
| r | 755868 | |
| t | 744279 | |
| d | 712375 | |
| l | 634304 | |
| s | 513119 | 5.1% |
| Other values (14) | 1488115 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 3169 | |
| 1 | 2473 | |
| 2 | 2356 | |
| 5 | 1566 | |
| 4 | 696 | 6.4% |
| 3 | 696 | 6.4% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 734246 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 106652 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 25194 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 10028017 | |
| Common | 877048 | 8.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 1275230 | |
| c | 1093900 | |
| e | 1018110 | |
| o | 1014248 | |
| u | 778469 | |
| r | 755868 | |
| t | 744279 | |
| d | 712375 | |
| l | 634304 | |
| s | 513119 | 5.1% |
| Other values (14) | 1488115 |
Common
| Value | Count | Frequency (%) |
| _ | 734246 | |
| (space) | 106652 | 12.2% |
| - | 25194 | 2.9% |
| 0 | 3169 | 0.4% |
| 1 | 2473 | 0.3% |
| 2 | 2356 | 0.3% |
| 5 | 1566 | 0.2% |
| 4 | 696 | 0.1% |
| 3 | 696 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 10905065 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 1275230 | |
| c | 1093900 | |
| e | 1018110 | |
| o | 1014248 | |
| u | 778469 | 7.1% |
| r | 755868 | 6.9% |
| t | 744279 | 6.8% |
| _ | 734246 | 6.7% |
| d | 712375 | 6.5% |
| l | 634304 | 5.8% |
| Other values (23) | 2144036 |
chemical_id
Text
| Distinct | 119877 |
|---|---|
| Distinct (%) | 24.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 28 |
|---|---|
| Median length | 28 |
| Mean length | 27.987951 |
| Min length | 1 |
Characters and Unicode
| Total characters | 13672730 |
|---|---|
| Distinct characters | 23 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 61316 |
|---|---|
| Unique (%) | 12.6% |
Sample
| 1st row | ToxVal20111_5683e23c9d49ad53 |
|---|---|
| 2nd row | ToxVal20111_219c2db0693a8ca9 |
| 3rd row | ToxVal20111_b74a50ce531fcc60 |
| 4th row | ToxVal20111_09f8b3377e5beb16 |
| 5th row | ToxVal20111_53e5726f2c6bb8ba |
| Value | Count | Frequency (%) |
| toxval00037_62939fa7957e9119 | 2561 | 0.5% |
| toxval00037_3e14196e68421b91 | 1418 | 0.3% |
| toxval00037_12c14fe32f5e62c5 | 1095 | 0.2% |
| toxval00037_d271d0f1fd16d7a2 | 1013 | 0.2% |
| toxval00037_d24ec2d59849708c | 875 | 0.2% |
| toxval00037_afa1a8d650e7133e | 859 | 0.2% |
| toxval00037_7bf2cccb0b1eebfc | 859 | 0.2% |
| toxval00037_a20bb5a93df0f92a | 814 | 0.2% |
| toxval00037_f84b51fa22435ed7 | 731 | 0.1% |
| toxval00037_fdd6837ad03a5b3d | 721 | 0.1% |
| Other values (119867) | 477576 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1637917 | 12.0% |
| a | 984500 | 7.2% |
| 1 | 935001 | 6.8% |
| 2 | 650931 | 4.8% |
| 6 | 633585 | 4.6% |
| 5 | 620748 | 4.5% |
| 3 | 618601 | 4.5% |
| 7 | 586063 | 4.3% |
| 9 | 564635 | 4.1% |
| 8 | 557841 | 4.1% |
| Other values (13) | 5882908 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 7341743 | |
| Lowercase Letter | 4865857 | |
| Uppercase Letter | 976608 | 7.1% |
| Connector Punctuation | 488304 | 3.6% |
| Dash Punctuation | 218 | < 0.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1637917 | |
| 1 | 935001 | |
| 2 | 650931 | 8.9% |
| 6 | 633585 | 8.6% |
| 5 | 620748 | 8.5% |
| 3 | 618601 | 8.4% |
| 7 | 586063 | 8.0% |
| 9 | 564635 | 7.7% |
| 8 | 557841 | 7.6% |
| 4 | 536421 | 7.3% |
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 984500 | |
| f | 490943 | |
| o | 488304 | |
| l | 488304 | |
| x | 488304 | |
| d | 486768 | |
| c | 481652 | |
| e | 481373 | |
| b | 475709 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 488304 | |
| V | 488304 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 488304 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 218 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 7830265 | |
| Latin | 5842465 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1637917 | |
| 1 | 935001 | |
| 2 | 650931 | 8.3% |
| 6 | 633585 | 8.1% |
| 5 | 620748 | 7.9% |
| 3 | 618601 | 7.9% |
| 7 | 586063 | 7.5% |
| 9 | 564635 | 7.2% |
| 8 | 557841 | 7.1% |
| 4 | 536421 | 6.9% |
| Other values (2) | 488522 | 6.2% |
Latin
| Value | Count | Frequency (%) |
| a | 984500 | |
| f | 490943 | |
| T | 488304 | |
| o | 488304 | |
| l | 488304 | |
| V | 488304 | |
| x | 488304 | |
| d | 486768 | |
| c | 481652 | |
| e | 481373 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 13672730 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1637917 | 12.0% |
| a | 984500 | 7.2% |
| 1 | 935001 | 6.8% |
| 2 | 650931 | 4.8% |
| 6 | 633585 | 4.6% |
| 5 | 620748 | 4.5% |
| 3 | 618601 | 4.5% |
| 7 | 586063 | 4.3% |
| 9 | 564635 | 4.1% |
| 8 | 557841 | 4.1% |
| Other values (13) | 5882908 |
dtxsid
Text
| Distinct | 45281 |
|---|---|
| Distinct (%) | 9.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 15 |
|---|---|
| Median length | 13 |
| Mean length | 12.468785 |
| Min length | 1 |
Characters and Unicode
| Total characters | 6091276 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 20881 |
|---|---|
| Unique (%) | 4.3% |
Sample
| 1st row | DTXSID4021557 |
|---|---|
| 2nd row | NODTXSID |
| 3rd row | DTXSID4044400 |
| 4th row | DTXSID5020607 |
| 5th row | DTXSID90893847 |
| Value | Count | Frequency (%) |
| nodtxsid | 25341 | 5.2% |
| (empty) | 20733 | 4.2% |
| dtxsid6034479 | 2704 | 0.6% |
| dtxsid6020226 | 1512 | 0.3% |
| dtxsid7021106 | 1423 | 0.3% |
| dtxsid5021124 | 1148 | 0.2% |
| dtxsid2040315 | 1099 | 0.2% |
| dtxsid9020247 | 1042 | 0.2% |
| dtxsid9020112 | 975 | 0.2% |
| dtxsid10947432 | 946 | 0.2% |
| Other values (45271) | 431599 |
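Two of the most frequent entries are not identifiers at all: "NODTXSID" (25341 rows) and a blank token (20733 rows, the same as the number of dash characters reported for this variable, so presumably "-"). A minimal sketch for flagging such placeholders, assuming valid identifiers have the form "DTXSID" followed by digits, as the samples suggest; the pattern and helper name are illustrative:

```python
import re

# Assumed format, inferred from the sample rows: "DTXSID" followed by digits.
DTXSID_PATTERN = re.compile(r"DTXSID\d+")

def is_valid_dtxsid(value):
    return DTXSID_PATTERN.fullmatch(value) is not None

samples = ["DTXSID4021557", "NODTXSID", "DTXSID4044400",
           "DTXSID5020607", "DTXSID90893847", "-"]
placeholders = [v for v in samples if not is_valid_dtxsid(v)]
print(placeholders)

# Share of rows without a usable identifier, from the reported counts.
placeholder_rows = 25_341 + 20_733
placeholder_pct = 100 * placeholder_rows / 488_522
print(f"{placeholder_rows} rows ({placeholder_pct:.1f}%)")
```

Because these placeholders are non-null strings, they do not show up in the Missing count of 0 for this variable.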
Most occurring characters
| Value | Count | Frequency (%) |
| D | 935578 | |
| 0 | 725253 | |
| 2 | 489113 | |
| T | 467789 | 7.7% |
| X | 467789 | 7.7% |
| S | 467789 | 7.7% |
| I | 467789 | 7.7% |
| 4 | 301169 | 4.9% |
| 1 | 299801 | 4.9% |
| 3 | 274701 | 4.5% |
| Other values (8) | 1194505 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 3213127 | |
| Uppercase Letter | 2857416 | |
| Dash Punctuation | 20733 | 0.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 725253 | |
| 2 | 489113 | |
| 4 | 301169 | |
| 1 | 299801 | |
| 3 | 274701 | 8.5% |
| 8 | 229509 | 7.1% |
| 9 | 227824 | 7.1% |
| 5 | 225003 | 7.0% |
| 6 | 221909 | 6.9% |
| 7 | 218845 | 6.8% |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 935578 | |
| T | 467789 | |
| X | 467789 | |
| S | 467789 | |
| I | 467789 | |
| O | 25341 | 0.9% |
| N | 25341 | 0.9% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 20733 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3233860 | |
| Latin | 2857416 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 725253 | |
| 2 | 489113 | |
| 4 | 301169 | |
| 1 | 299801 | |
| 3 | 274701 | 8.5% |
| 8 | 229509 | 7.1% |
| 9 | 227824 | 7.0% |
| 5 | 225003 | 7.0% |
| 6 | 221909 | 6.9% |
| 7 | 218845 | 6.8% |
Latin
| Value | Count | Frequency (%) |
| D | 935578 | |
| T | 467789 | |
| X | 467789 | |
| S | 467789 | |
| I | 467789 | |
| O | 25341 | 0.9% |
| N | 25341 | 0.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6091276 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| D | 935578 | |
| 0 | 725253 | |
| 2 | 489113 | |
| T | 467789 | 7.7% |
| X | 467789 | 7.7% |
| S | 467789 | 7.7% |
| I | 467789 | 7.7% |
| 4 | 301169 | 4.9% |
| 1 | 299801 | 4.9% |
| 3 | 274701 | 4.5% |
| Other values (8) | 1194505 |
source
Categorical
HIGH CORRELATION 
| Distinct | 47 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 30 |
|---|---|
| Median length | 27 |
| Mean length | 10.090287 |
| Min length | 3 |
Characters and Unicode
| Total characters | 4929327 |
|---|---|
| Distinct characters | 57 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | ECHA IUCLID |
|---|---|
| 2nd row | ECHA IUCLID |
| 3rd row | ECHA IUCLID |
| 4th row | ECHA IUCLID |
| 5th row | ECHA IUCLID |
Common Values
| Value | Count | Frequency (%) |
| ECHA IUCLID | 167663 | |
| EnviroTox_v2 | 79988 | |
| ToxRefDB | 56485 | 11.6% |
| ChemIDplus | 48671 | 10.0% |
| HPVIS | 18075 | 3.7% |
| EFSA | 15596 | 3.2% |
| COSMOS | 13904 | 2.8% |
| TEST | 13676 | 2.8% |
| RSL | 13538 | 2.8% |
| DOD | 13461 | 2.8% |
| Other values (37) | 47465 | 9.7% |
Words
| Value | Count | Frequency (%) |
| echa | 167663 | |
| iuclid | 167663 | |
| envirotox_v2 | 79988 | |
| toxrefdb | 56485 | 7.6% |
| chemidplus | 48671 | 6.6% |
| hpvis | 18075 | 2.4% |
| efsa | 15596 | 2.1% |
| doe | 14687 | 2.0% |
| cosmos | 13904 | 1.9% |
| test | 13676 | 1.8% |
| Other values (66) | 146012 |
Most occurring characters
| Value | Count | Frequency (%) |
| C | 426286 | 8.6% |
| I | 410640 | 8.3% |
| D | 327122 | 6.6% |
| E | 320486 | 6.5% |
| (space) | 253898 | 5.2% |
| o | 245157 | 5.0% |
| A | 232046 | 4.7% |
| H | 197851 | 4.0% |
| L | 197667 | 4.0% |
| v | 175086 | 3.6% |
| Other values (47) | 2143088 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 2926675 | |
| Lowercase Letter | 1567449 | |
| Space Separator | 253898 | 5.2% |
| Decimal Number | 90944 | 1.8% |
| Connector Punctuation | 79988 | 1.6% |
| Dash Punctuation | 4794 | 0.1% |
| Open Punctuation | 2755 | 0.1% |
| Close Punctuation | 2755 | 0.1% |
| Other Punctuation | 69 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 245157 | |
| v | 175086 | |
| e | 152512 | |
| x | 137130 | |
| i | 134931 | |
| r | 127660 | 8.1% |
| n | 104974 | 6.7% |
| s | 58694 | 3.7% |
| f | 57045 | 3.6% |
| l | 54920 | 3.5% |
| Other values (13) | 319340 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 426286 | |
| I | 410640 | |
| D | 327122 | |
| E | 320486 | |
| A | 232046 | |
| H | 197851 | |
| L | 197667 | |
| T | 170198 | 5.8% |
| U | 169599 | 5.8% |
| S | 103575 | 3.5% |
| Other values (12) | 371205 |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 82344 | |
| 0 | 3169 | 3.5% |
| 1 | 2473 | 2.7% |
| 5 | 1566 | 1.7% |
| 4 | 696 | 0.8% |
| 3 | 696 | 0.8% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 253898 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 79988 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 4794 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 2755 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 2755 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 69 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4494124 | |
| Common | 435203 | 8.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| C | 426286 | 9.5% |
| I | 410640 | 9.1% |
| D | 327122 | 7.3% |
| E | 320486 | 7.1% |
| o | 245157 | 5.5% |
| A | 232046 | 5.2% |
| H | 197851 | 4.4% |
| L | 197667 | 4.4% |
| v | 175086 | 3.9% |
| T | 170198 | 3.8% |
| Other values (35) | 1791585 |
Common
| Value | Count | Frequency (%) |
| (space) | 253898 | |
| 2 | 82344 | 18.9% |
| _ | 79988 | 18.4% |
| - | 4794 | 1.1% |
| 0 | 3169 | 0.7% |
| ( | 2755 | 0.6% |
| ) | 2755 | 0.6% |
| 1 | 2473 | 0.6% |
| 5 | 1566 | 0.4% |
| 4 | 696 | 0.2% |
| Other values (2) | 765 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4929327 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| C | 426286 | 8.6% |
| I | 410640 | 8.3% |
| D | 327122 | 6.6% |
| E | 320486 | 6.5% |
| (space) | 253898 | 5.2% |
| o | 245157 | 5.0% |
| A | 232046 | 4.7% |
| H | 197851 | 4.0% |
| L | 197667 | 4.0% |
| v | 175086 | 3.6% |
| Other values (47) | 2143088 |
subsource
Text
| Distinct | 94 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 68 |
|---|---|
| Median length | 52 |
| Mean length | 12.292321 |
| Min length | 1 |
Characters and Unicode
| Total characters | 6005069 |
|---|---|
| Distinct characters | 59 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 7 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Repeated Dose Toxicity Oral |
|---|---|
| 2nd row | Repeated Dose Toxicity Oral |
| 3rd row | Repeated Dose Toxicity Oral |
| 4th row | Repeated Dose Toxicity Oral |
| 5th row | Repeated Dose Toxicity Oral |
| Value | Count | Frequency (%) |
| (empty) | 181775 | |
| toxicity | 156720 | |
| acute | 82966 | 8.2% |
| oral | 80796 | 8.0% |
| repeated | 51177 | 5.0% |
| dose | 51177 | 5.0% |
| opp_der | 46701 | 4.6% |
| inhalation | 28966 | 2.9% |
| dermal | 25288 | 2.5% |
| developmental | 22549 | 2.2% |
| Other values (122) | 286249 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 551626 | 9.2% |
| (space) | 525870 | 8.8% |
| i | 485059 | 8.1% |
| t | 428299 | 7.1% |
| o | 365505 | 6.1% |
| a | 307760 | 5.1% |
| c | 283438 | 4.7% |
| l | 229165 | 3.8% |
| r | 217335 | 3.6% |
| T | 213793 | 3.6% |
| Other values (49) | 2397219 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4065557 | |
| Uppercase Letter | 1118891 | 18.6% |
| Space Separator | 525870 | 8.8% |
| Dash Punctuation | 181831 | 3.0% |
| Decimal Number | 60933 | 1.0% |
| Connector Punctuation | 47395 | 0.8% |
| Open Punctuation | 2106 | < 0.1% |
| Close Punctuation | 2106 | < 0.1% |
| Other Punctuation | 380 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 551626 | |
| i | 485059 | |
| t | 428299 | |
| o | 365505 | |
| a | 307760 | 7.6% |
| c | 283438 | 7.0% |
| l | 229165 | 5.6% |
| r | 217335 | 5.3% |
| y | 204874 | 5.0% |
| p | 190905 | 4.7% |
| Other values (13) | 801591 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 213793 | |
| A | 181642 | |
| D | 135003 | |
| O | 101855 | |
| S | 77658 | 6.9% |
| E | 72959 | 6.5% |
| F | 66916 | 6.0% |
| R | 63712 | 5.7% |
| P | 38334 | 3.4% |
| C | 34428 | 3.1% |
| Other values (10) | 132591 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 20206 | |
| 2 | 19390 | |
| 3 | 14157 | |
| 1 | 4783 | 7.8% |
| 5 | 1026 | 1.7% |
| 4 | 696 | 1.1% |
| 9 | 675 | 1.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 194 | |
| , | 110 | |
| ' | 58 | 15.3% |
| ; | 18 | 4.7% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 525870 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 181831 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 47395 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 2106 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 2106 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5184448 | |
| Common | 820621 | 13.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 551626 | 10.6% |
| i | 485059 | 9.4% |
| t | 428299 | 8.3% |
| o | 365505 | 7.1% |
| a | 307760 | 5.9% |
| c | 283438 | 5.5% |
| l | 229165 | 4.4% |
| r | 217335 | 4.2% |
| T | 213793 | 4.1% |
| y | 204874 | 4.0% |
| Other values (33) | 1897594 |
Common
| Value | Count | Frequency (%) |
| (space) | 525870 | |
| - | 181831 | 22.2% |
| _ | 47395 | 5.8% |
| 0 | 20206 | 2.5% |
| 2 | 19390 | 2.4% |
| 3 | 14157 | 1.7% |
| 1 | 4783 | 0.6% |
| ( | 2106 | 0.3% |
| ) | 2106 | 0.3% |
| 5 | 1026 | 0.1% |
| Other values (6) | 1751 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6005069 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 551626 | 9.2% |
| (space) | 525870 | 8.8% |
| i | 485059 | 8.1% |
| t | 428299 | 7.1% |
| o | 365505 | 6.1% |
| a | 307760 | 5.1% |
| c | 283438 | 4.7% |
| l | 229165 | 3.8% |
| r | 217335 | 3.6% |
| T | 213793 | 3.6% |
| Other values (49) | 2397219 |
source_url
Categorical
HIGH CORRELATION 
| Distinct | 36 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 110 |
|---|---|
| Median length | 98 |
| Mean length | 43.003056 |
| Min length | 1 |
Characters and Unicode
| Total characters | 21007939 |
|---|---|
| Distinct characters | 64 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | https://echa.europa.eu/information-on-chemicals/registered-substances |
|---|---|
| 2nd row | https://echa.europa.eu/information-on-chemicals/registered-substances |
| 3rd row | https://echa.europa.eu/information-on-chemicals/registered-substances |
| 4th row | https://echa.europa.eu/information-on-chemicals/registered-substances |
| 5th row | https://echa.europa.eu/information-on-chemicals/registered-substances |
Common Values
| Value | Count | Frequency (%) |
| https://echa.europa.eu/information-on-chemicals/registered-substances | 167663 | |
| - | 123041 | |
| https://envirotoxdatabase.org/ | 79988 | |
| https://chemview.epa.gov/chemview/ | 18075 | 3.7% |
| https://www.ng.cosmosdb.eu/ | 13904 | 2.8% |
| https://www.epa.gov/chemical-research/toxicity-estimation-software-tool-test | 13676 | 2.8% |
| https://www.epa.gov/risk/regional-screening-levels-rsls-generic-tables | 13538 | 2.8% |
| https://phc.amedd.army.mil/Pages/Library.aspx?queries[series]=PHC+Technical+Guide | 13461 | 2.8% |
| https://www.energy.gov/ehss/protective-action-criteria-pac-aegls-erpgs-teels-rev-29-chemicals-concern-may-2016 | 11733 | 2.4% |
| source_url | 5065 | 1.0% |
| Other values (26) | 28378 | 5.8% |
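Two of the common values are not URLs: "-" (123041 rows) and the literal string "source_url" (5065 rows), the latter possibly a header that leaked into the data. A minimal sketch for flagging such entries with the standard library, assuming that a usable value must have an http or https scheme and a host; the helper name is illustrative:

```python
from urllib.parse import urlparse

def looks_like_url(value):
    """True when the value parses with an http(s) scheme and a network location."""
    parsed = urlparse(value)
    return parsed.scheme in {"http", "https"} and bool(parsed.netloc)

values = [
    "https://envirotoxdatabase.org/",
    "https://chemview.epa.gov/chemview/",
    "-",
    "source_url",
]
flagged = [v for v in values if not looks_like_url(v)]
print(flagged)

# Share of rows without a usable URL, from the reported counts.
non_url_rows = 123_041 + 5_065
non_url_pct = 100 * non_url_rows / 488_522
print(f"{non_url_rows} rows ({non_url_pct:.1f}%)")
```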
Words
| Value | Count | Frequency (%) |
| https://echa.europa.eu/information-on-chemicals/registered-substances | 167663 | |
| (empty) | 123041 | |
| https://envirotoxdatabase.org | 79988 | |
| https://chemview.epa.gov/chemview | 18075 | 3.7% |
| https://www.ng.cosmosdb.eu | 13904 | 2.8% |
| https://www.epa.gov/chemical-research/toxicity-estimation-software-tool-test | 13676 | 2.8% |
| https://www.epa.gov/risk/regional-screening-levels-rsls-generic-tables | 13538 | 2.8% |
| https://phc.amedd.army.mil/pages/library.aspx?queries[series]=phc+technical+guide | 13461 | 2.8% |
| https://www.energy.gov/ehss/protective-action-criteria-pac-aegls-erpgs-teels-rev-29-chemicals-concern-may-2016 | 11733 | 2.4% |
| source_url | 5065 | 1.0% |
| Other values (26) | 28378 | 5.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 2175910 | 10.4% |
| s | 1666152 | 7.9% |
| t | 1626298 | 7.7% |
| a | 1437010 | 6.8% |
| / | 1359006 | 6.5% |
| o | 1189015 | 5.7% |
| r | 1151526 | 5.5% |
| i | 1117359 | 5.3% |
| c | 982846 | 4.7% |
| n | 931791 | 4.4% |
| Other values (54) | 7371026 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 17269400 | |
| Other Punctuation | 2447833 | 11.7% |
| Dash Punctuation | 930572 | 4.4% |
| Uppercase Letter | 164427 | 0.8% |
| Decimal Number | 108688 | 0.5% |
| Math Symbol | 44959 | 0.2% |
| Connector Punctuation | 15138 | 0.1% |
| Close Punctuation | 13461 | 0.1% |
| Open Punctuation | 13461 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 2175910 | |
| s | 1666152 | |
| t | 1626298 | 9.4% |
| a | 1437010 | 8.3% |
| o | 1189015 | 6.9% |
| r | 1151526 | 6.7% |
| i | 1117359 | 6.5% |
| c | 982846 | 5.7% |
| n | 931791 | 5.4% |
| h | 829084 | 4.8% |
| Other values (15) | 4162409 |
Uppercase Letter
| Value | Count | Frequency (%) |
| P | 32250 | |
| C | 22013 | |
| L | 21455 | |
| H | 20865 | |
| T | 14462 | |
| G | 13461 | |
| N | 10168 | 6.2% |
| E | 7448 | 4.5% |
| B | 4598 | 2.8% |
| A | 3080 | 1.9% |
| Other values (8) | 14627 |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 29164 | |
| 0 | 20851 | |
| 1 | 18627 | |
| 9 | 15401 | |
| 6 | 13365 | |
| 3 | 3482 | 3.2% |
| 8 | 3130 | 2.9% |
| 4 | 2240 | 2.1% |
| 7 | 1754 | 1.6% |
| 5 | 674 | 0.6% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 1359006 | |
| . | 706678 | |
| : | 361934 | 14.8% |
| ? | 18037 | 0.7% |
| % | 2178 | 0.1% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 26922 | |
| = | 18037 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 930572 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 15138 |
Close Punctuation
| Value | Count | Frequency (%) |
| ] | 13461 |
Open Punctuation
| Value | Count | Frequency (%) |
| [ | 13461 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 17433827 | |
| Common | 3574112 | 17.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 2175910 | |
| s | 1666152 | 9.6% |
| t | 1626298 | 9.3% |
| a | 1437010 | 8.2% |
| o | 1189015 | 6.8% |
| r | 1151526 | 6.6% |
| i | 1117359 | 6.4% |
| c | 982846 | 5.6% |
| n | 931791 | 5.3% |
| h | 829084 | 4.8% |
| Other values (33) | 4326836 |
Common
| Value | Count | Frequency (%) |
| / | 1359006 | |
| - | 930572 | |
| . | 706678 | |
| : | 361934 | 10.1% |
| 2 | 29164 | 0.8% |
| + | 26922 | 0.8% |
| 0 | 20851 | 0.6% |
| 1 | 18627 | 0.5% |
| ? | 18037 | 0.5% |
| = | 18037 | 0.5% |
| Other values (11) | 84284 | 2.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 21007939 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 2175910 | 10.4% |
| s | 1666152 | 7.9% |
| t | 1626298 | 7.7% |
| a | 1437010 | 6.8% |
| / | 1359006 | 6.5% |
| o | 1189015 | 5.7% |
| r | 1151526 | 5.5% |
| i | 1117359 | 5.3% |
| c | 982846 | 4.7% |
| n | 931791 | 4.4% |
| Other values (54) | 7371026 |
subsource_url
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - | 488366 |
|---|---|
| subsource_url | 156 |
Length
| Max length | 13 |
|---|---|
| Median length | 1 |
| Mean length | 1.003832 |
| Min length | 1 |
Characters and Unicode
| Total characters | 490394 |
|---|---|
| Distinct characters | 10 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
Common Values
| Value | Count | Frequency (%) |
| - | 488366 | > 99.9% |
| subsource_url | 156 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| (empty string) | 488366 | |
| subsource_url | 156 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 488366 | |
| u | 468 | 0.1% |
| s | 312 | 0.1% |
| r | 312 | 0.1% |
| b | 156 | < 0.1% |
| o | 156 | < 0.1% |
| c | 156 | < 0.1% |
| e | 156 | < 0.1% |
| _ | 156 | < 0.1% |
| l | 156 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 488366 | |
| Lowercase Letter | 1872 | 0.4% |
| Connector Punctuation | 156 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| u | 468 | |
| s | 312 | |
| r | 312 | |
| b | 156 | 8.3% |
| o | 156 | 8.3% |
| c | 156 | 8.3% |
| e | 156 | 8.3% |
| l | 156 | 8.3% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 488366 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 156 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 488522 | |
| Latin | 1872 | 0.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| u | 468 | |
| s | 312 | |
| r | 312 | |
| b | 156 | 8.3% |
| o | 156 | 8.3% |
| c | 156 | 8.3% |
| e | 156 | 8.3% |
| l | 156 | 8.3% |
Common
| Value | Count | Frequency (%) |
| - | 488366 | |
| _ | 156 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 490394 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 488366 | |
| u | 468 | 0.1% |
| s | 312 | 0.1% |
| r | 312 | 0.1% |
| b | 156 | < 0.1% |
| o | 156 | < 0.1% |
| c | 156 | < 0.1% |
| e | 156 | < 0.1% |
| _ | 156 | < 0.1% |
| l | 156 | < 0.1% |
details_text
Categorical
HIGH CORRELATION 
| Distinct | 47 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| ECHA IUCLID Details | 167663 |
|---|---|
| EnviroTox_v2 Details | 79988 |
| ToxRefDB Details | 56485 |
| ChemIDPlus Details | 48671 |
| HPVIS Details | 18075 |
| Other values (42) | 117640 |
Length
| Max length | 38 |
|---|---|
| Median length | 35 |
| Mean length | 18.090287 |
| Min length | 11 |
Characters and Unicode
| Total characters | 8837503 |
|---|---|
| Distinct characters | 57 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | ECHA IUCLID Details |
|---|---|
| 2nd row | ECHA IUCLID Details |
| 3rd row | ECHA IUCLID Details |
| 4th row | ECHA IUCLID Details |
| 5th row | ECHA IUCLID Details |
Common Values
| Value | Count | Frequency (%) |
| ECHA IUCLID Details | 167663 | 34.3% |
| EnviroTox_v2 Details | 79988 | 16.4% |
| ToxRefDB Details | 56485 | 11.6% |
| ChemIDPlus Details | 48671 | 10.0% |
| HPVIS Details | 18075 | 3.7% |
| EFSA Details | 15596 | 3.2% |
| COSMOS Details | 13904 | 2.8% |
| TEST Details | 13676 | 2.8% |
| RSL Details | 13538 | 2.8% |
| DOD Details | 13461 | 2.8% |
| Other values (37) | 47465 | 9.7% |
Words
| Value | Count | Frequency (%) |
| details | 488522 | |
| echa | 167663 | 13.6% |
| iuclid | 167663 | 13.6% |
| envirotox_v2 | 79988 | 6.5% |
| toxrefdb | 56485 | 4.6% |
| chemidplus | 48671 | 4.0% |
| hpvis | 18075 | 1.5% |
| efsa | 15596 | 1.3% |
| doe | 14687 | 1.2% |
| cosmos | 13904 | 1.1% |
| Other values (67) | 159688 | 13.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| D | 815644 | 9.2% |
| (space) | 742420 | 8.4% |
| e | 641034 | 7.3% |
| i | 623453 | 7.1% |
| s | 547216 | 6.2% |
| l | 543442 | 6.1% |
| t | 542282 | 6.1% |
| a | 514386 | 5.8% |
| C | 426286 | 4.8% |
| I | 410640 | 4.6% |
| Other values (47) | 3030700 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4449910 | |
| Uppercase Letter | 3463868 | |
| Space Separator | 742420 | 8.4% |
| Decimal Number | 90944 | 1.0% |
| Connector Punctuation | 79988 | 0.9% |
| Dash Punctuation | 4794 | 0.1% |
| Close Punctuation | 2755 | < 0.1% |
| Open Punctuation | 2755 | < 0.1% |
| Other Punctuation | 69 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 641034 | |
| i | 623453 | |
| s | 547216 | |
| l | 543442 | |
| t | 542282 | |
| a | 514386 | |
| o | 245157 | 5.5% |
| v | 175086 | 3.9% |
| x | 137130 | 3.1% |
| r | 127660 | 2.9% |
| Other values (13) | 353064 |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 815644 | |
| C | 426286 | |
| I | 410640 | |
| E | 320486 | 9.3% |
| A | 232046 | 6.7% |
| H | 197851 | 5.7% |
| L | 197667 | 5.7% |
| T | 170198 | 4.9% |
| U | 169599 | 4.9% |
| P | 120958 | 3.5% |
| Other values (12) | 402493 |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 82344 | |
| 0 | 3169 | 3.5% |
| 1 | 2473 | 2.7% |
| 5 | 1566 | 1.7% |
| 4 | 696 | 0.8% |
| 3 | 696 | 0.8% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 742420 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 79988 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 4794 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 2755 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 2755 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 69 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 7913778 | |
| Common | 923725 | 10.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| D | 815644 | 10.3% |
| e | 641034 | 8.1% |
| i | 623453 | 7.9% |
| s | 547216 | 6.9% |
| l | 543442 | 6.9% |
| t | 542282 | 6.9% |
| a | 514386 | 6.5% |
| C | 426286 | 5.4% |
| I | 410640 | 5.2% |
| E | 320486 | 4.0% |
| Other values (35) | 2528909 |
Common
| Value | Count | Frequency (%) |
| (space) | 742420 | |
| 2 | 82344 | 8.9% |
| _ | 79988 | 8.7% |
| - | 4794 | 0.5% |
| 0 | 3169 | 0.3% |
| ) | 2755 | 0.3% |
| ( | 2755 | 0.3% |
| 1 | 2473 | 0.3% |
| 5 | 1566 | 0.2% |
| 4 | 696 | 0.1% |
| Other values (2) | 765 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8837503 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| D | 815644 | 9.2% |
| (space) | 742420 | 8.4% |
| e | 641034 | 7.3% |
| i | 623453 | 7.1% |
| s | 547216 | 6.2% |
| l | 543442 | 6.1% |
| t | 542282 | 6.1% |
| a | 514386 | 5.8% |
| C | 426286 | 4.8% |
| I | 410640 | 4.6% |
| Other values (47) | 3030700 |
priority_id
Categorical
HIGH CORRELATION 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| 5 | 366109 |
|---|---|
| 4 | 57027 |
| 3 | 34813 |
| 1 | 16057 |
| 2 | 14516 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 488522 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 5 |
|---|---|
| 2nd row | 5 |
| 3rd row | 5 |
| 4th row | 5 |
| 5th row | 5 |
Common Values
| Value | Count | Frequency (%) |
| 5 | 366109 | 74.9% |
| 4 | 57027 | 11.7% |
| 3 | 34813 | 7.1% |
| 1 | 16057 | 3.3% |
| 2 | 14516 | 3.0% |
Words
| Value | Count | Frequency (%) |
| 5 | 366109 | |
| 4 | 57027 | 11.7% |
| 3 | 34813 | 7.1% |
| 1 | 16057 | 3.3% |
| 2 | 14516 | 3.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 5 | 366109 | |
| 4 | 57027 | 11.7% |
| 3 | 34813 | 7.1% |
| 1 | 16057 | 3.3% |
| 2 | 14516 | 3.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 488522 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 5 | 366109 | |
| 4 | 57027 | 11.7% |
| 3 | 34813 | 7.1% |
| 1 | 16057 | 3.3% |
| 2 | 14516 | 3.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 488522 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 5 | 366109 | |
| 4 | 57027 | 11.7% |
| 3 | 34813 | 7.1% |
| 1 | 16057 | 3.3% |
| 2 | 14516 | 3.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 488522 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 5 | 366109 | |
| 4 | 57027 | 11.7% |
| 3 | 34813 | 7.1% |
| 1 | 16057 | 3.3% |
| 2 | 14516 | 3.0% |
qc_status
Categorical
IMBALANCE 
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| pass | 444951 |
|---|---|
| fail:dtxsid not specified | 25046 |
| fail:human_eco not specified | 8225 |
| fail:toxval_units not specified | 6917 |
| fail:toxval_type not specified | 3356 |
| Other values (3) | 27 |
Length
| Max length | 40 |
|---|---|
| Median length | 4 |
| Mean length | 6.0426716 |
| Min length | 4 |
Characters and Unicode
| Total characters | 2951978 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | fail:toxval_units not specified |
|---|---|
| 2nd row | fail:dtxsid not specified |
| 3rd row | fail:toxval_units not specified |
| 4th row | pass |
| 5th row | pass |
Common Values
| Value | Count | Frequency (%) |
| pass | 444951 | 91.1% |
| fail:dtxsid not specified | 25046 | 5.1% |
| fail:human_eco not specified | 8225 | 1.7% |
| fail:toxval_units not specified | 6917 | 1.4% |
| fail:toxval_type not specified | 3356 | 0.7% |
| fail:toxval_numeric<0 | 23 | < 0.1% |
| fail:toxval_numeric is null | 2 | < 0.1% |
| fail:risk_assessment_class not specified | 2 | < 0.1% |
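The Frequency (%) column in this table appears to be each count divided by the number of observations (488522). As a hedged sketch, the shares can be recomputed from the counts listed above. The names `qc_counts` and `frequency_percent` are illustrative and not part of the report tooling.

```python
# Counts copied from the qc_status "Common Values" table above.
qc_counts = {
    "pass": 444951,
    "fail:dtxsid not specified": 25046,
    "fail:human_eco not specified": 8225,
    "fail:toxval_units not specified": 6917,
    "fail:toxval_type not specified": 3356,
    "fail:toxval_numeric<0": 23,
    "fail:toxval_numeric is null": 2,
    "fail:risk_assessment_class not specified": 2,
}


def frequency_percent(counts):
    """Return each value's share of the total in percent, rounded to one decimal."""
    total = sum(counts.values())
    return {value: round(100 * n / total, 1) for value, n in counts.items()}


total_rows = sum(qc_counts.values())          # should equal the number of observations
shares = frequency_percent(qc_counts)
failed_rows = total_rows - qc_counts["pass"]  # rows with any "fail:" status

print(total_rows, failed_rows, shares["pass"])
```

The counts sum to the reported number of observations, which supports reading the percentages as shares of all rows rather than of non-missing rows only.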
Words
| Value | Count | Frequency (%) |
| pass | 444951 | |
| not | 43546 | 7.6% |
| specified | 43546 | 7.6% |
| fail:dtxsid | 25046 | 4.4% |
| fail:human_eco | 8225 | 1.4% |
| fail:toxval_units | 6917 | 1.2% |
| fail:toxval_type | 3356 | 0.6% |
| fail:toxval_numeric<0 | 23 | < 0.1% |
| fail:toxval_numeric | 2 | < 0.1% |
| is | 2 | < 0.1% |
| Other values (2) | 4 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| s | 965427 | |
| a | 507049 | |
| p | 491853 | |
| i | 162655 | 5.5% |
| e | 98702 | 3.3% |
| d | 93638 | 3.2% |
| t | 89165 | 3.0% |
| f | 87117 | 3.0% |
| (space) | 87096 | 3.0% |
| o | 62069 | 2.1% |
| Other values (15) | 307207 | 10.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 2802738 | |
| Space Separator | 87096 | 3.0% |
| Other Punctuation | 43571 | 1.5% |
| Connector Punctuation | 18527 | 0.6% |
| Math Symbol | 23 | < 0.1% |
| Decimal Number | 23 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| s | 965427 | |
| a | 507049 | |
| p | 491853 | |
| i | 162655 | 5.8% |
| e | 98702 | 3.5% |
| d | 93638 | 3.3% |
| t | 89165 | 3.2% |
| f | 87117 | 3.1% |
| o | 62069 | 2.2% |
| n | 58717 | 2.1% |
| Other values (10) | 186346 | 6.6% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 87096 |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 43571 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 18527 |
Math Symbol
| Value | Count | Frequency (%) |
| < | 23 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 23 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2802738 | |
| Common | 149240 | 5.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| s | 965427 | |
| a | 507049 | |
| p | 491853 | |
| i | 162655 | 5.8% |
| e | 98702 | 3.5% |
| d | 93638 | 3.3% |
| t | 89165 | 3.2% |
| f | 87117 | 3.1% |
| o | 62069 | 2.2% |
| n | 58717 | 2.1% |
| Other values (10) | 186346 | 6.6% |
Common
| Value | Count | Frequency (%) |
| (space) | 87096 | |
| : | 43571 | |
| _ | 18527 | 12.4% |
| < | 23 | < 0.1% |
| 0 | 23 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2951978 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| s | 965427 | |
| a | 507049 | |
| p | 491853 | |
| i | 162655 | 5.5% |
| e | 98702 | 3.3% |
| d | 93638 | 3.2% |
| t | 89165 | 3.0% |
| f | 87117 | 3.0% |
| (space) | 87096 | 3.0% |
| o | 62069 | 2.1% |
| Other values (15) | 307207 | 10.4% |
risk_assessment_class
Categorical
HIGH CORRELATION 
| Distinct | 30 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| acute | 245254 |
|---|---|
| chronic | 66706 |
| subchronic | 45249 |
| developmental | 34008 |
| short-term | 28940 |
| Other values (25) | 68365 |
Length
| Max length | 26 |
|---|---|
| Median length | 5 |
| Mean length | 8.2424701 |
| Min length | 1 |
Characters and Unicode
| Total characters | 4026628 |
|---|---|
| Distinct characters | 26 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | short-term |
|---|---|
| 2nd row | short-term |
| 3rd row | short-term |
| 4th row | short-term |
| 5th row | subchronic |
Common Values
| Value | Count | Frequency (%) |
| acute | 245254 | 50.2% |
| chronic | 66706 | 13.7% |
| subchronic | 45249 | 9.3% |
| developmental | 34008 | 7.0% |
| short-term | 28940 | 5.9% |
| air quality standard | 16819 | 3.4% |
| reproduction | 15252 | 3.1% |
| water quality standard | 12946 | 2.7% |
| genotoxicity | 4848 | 1.0% |
| soil quality standard | 3959 | 0.8% |
| Other values (20) | 14541 | 3.0% |
Words
| Value | Count | Frequency (%) |
| acute | 245308 | |
| chronic | 66713 | 11.8% |
| subchronic | 45280 | 8.0% |
| standard | 34476 | 6.1% |
| developmental | 34391 | 6.1% |
| quality | 33724 | 6.0% |
| short-term | 28975 | 5.1% |
| air | 16819 | 3.0% |
| reproduction | 15635 | 2.8% |
| water | 13698 | 2.4% |
| Other values (20) | 31659 | 5.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| c | 498262 | |
| t | 462369 | |
| e | 434707 | |
| a | 416332 | |
| u | 347885 | |
| r | 280369 | 7.0% |
| o | 238303 | 5.9% |
| i | 208135 | 5.2% |
| n | 206688 | 5.1% |
| h | 150300 | 3.7% |
| Other values (16) | 783278 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3919179 | |
| Space Separator | 78156 | 1.9% |
| Dash Punctuation | 28977 | 0.7% |
| Uppercase Letter | 316 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| c | 498262 | |
| t | 462369 | |
| e | 434707 | |
| a | 416332 | |
| u | 347885 | |
| r | 280369 | |
| o | 238303 | 6.1% |
| i | 208135 | 5.3% |
| n | 206688 | 5.3% |
| h | 150300 | 3.8% |
| Other values (13) | 675829 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 78156 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 28977 |
Uppercase Letter
| Value | Count | Frequency (%) |
| H | 316 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3919495 | |
| Common | 107133 | 2.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| c | 498262 | |
| t | 462369 | |
| e | 434707 | |
| a | 416332 | |
| u | 347885 | |
| r | 280369 | |
| o | 238303 | 6.1% |
| i | 208135 | 5.3% |
| n | 206688 | 5.3% |
| h | 150300 | 3.8% |
| Other values (14) | 676145 |
Common
| Value | Count | Frequency (%) |
| (space) | 78156 | |
| - | 28977 | 27.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4026628 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| c | 498262 | |
| t | 462369 | |
| e | 434707 | |
| a | 416332 | |
| u | 347885 | |
| r | 280369 | 7.0% |
| o | 238303 | 5.9% |
| i | 208135 | 5.2% |
| n | 206688 | 5.1% |
| h | 150300 | 3.7% |
| Other values (16) | 783278 |
human_eco
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| human health | 384213 |
|---|---|
| eco | 97489 |
| not specified | 4909 |
| microorganisms | 1911 |
Length
| Max length | 14 |
|---|---|
| Median length | 12 |
| Mean length | 10.221841 |
| Min length | 3 |
Characters and Unicode
| Total characters | 4993594 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | human health |
|---|---|
| 2nd row | human health |
| 3rd row | human health |
| 4th row | human health |
| 5th row | human health |
Common Values
| Value | Count | Frequency (%) |
| human health | 384213 | 78.6% |
| eco | 97489 | 20.0% |
| not specified | 4909 | 1.0% |
| microorganisms | 1911 | 0.4% |
Words
| Value | Count | Frequency (%) |
| human | 384213 | |
| health | 384213 | |
| eco | 97489 | 11.1% |
| not | 4909 | 0.6% |
| specified | 4909 | 0.6% |
| microorganisms | 1911 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| h | 1152639 | |
| a | 770337 | |
| e | 491520 | |
| n | 391033 | 7.8% |
| (space) | 389122 | 7.8% |
| t | 389122 | 7.8% |
| m | 388035 | 7.8% |
| l | 384213 | 7.7% |
| u | 384213 | 7.7% |
| o | 106220 | 2.1% |
| Other values (8) | 147140 | 2.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4604472 | |
| Space Separator | 389122 | 7.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| h | 1152639 | |
| a | 770337 | |
| e | 491520 | |
| n | 391033 | 8.5% |
| t | 389122 | 8.5% |
| m | 388035 | 8.4% |
| l | 384213 | 8.3% |
| u | 384213 | 8.3% |
| o | 106220 | 2.3% |
| c | 104309 | 2.3% |
| Other values (7) | 42831 | 0.9% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 389122 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4604472 | |
| Common | 389122 | 7.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| h | 1152639 | |
| a | 770337 | |
| e | 491520 | |
| n | 391033 | 8.5% |
| t | 389122 | 8.5% |
| m | 388035 | 8.4% |
| l | 384213 | 8.3% |
| u | 384213 | 8.3% |
| o | 106220 | 2.3% |
| c | 104309 | 2.3% |
| Other values (7) | 42831 | 0.9% |
Common
| Value | Count | Frequency (%) |
| (space) | 389122 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4993594 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| h | 1152639 | |
| a | 770337 | |
| e | 491520 | |
| n | 391033 | 7.8% |
| (space) | 389122 | 7.8% |
| t | 389122 | 7.8% |
| m | 388035 | 7.8% |
| l | 384213 | 7.7% |
| u | 384213 | 7.7% |
| o | 106220 | 2.1% |
| Other values (8) | 147140 | 2.9% |
toxval_type
Text
| Distinct | 259 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 52 |
|---|---|
| Median length | 4 |
| Mean length | 5.047353 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2465743 |
|---|---|
| Distinct characters | 65 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 54 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | NOAEL |
|---|---|
| 2nd row | NOAEL |
| 3rd row | LOEL |
| 4th row | LOAEL |
| 5th row | NOAEL |
Words
| Value | Count | Frequency (%) |
| ld50 | 129213 | |
| noael | 70305 | |
| lc50 | 67058 | |
| loael | 26652 | 4.9% |
| lel | 23280 | 4.3% |
| ec50 | 22166 | 4.1% |
| nel | 14689 | 2.7% |
| noec | 14615 | 2.7% |
| noel | 14149 | 2.6% |
| meg | 13461 | 2.5% |
| Other values (273) | 147389 |
Most occurring characters
| Value | Count | Frequency (%) |
| L | 436575 | |
| E | 235756 | 9.6% |
| 0 | 230347 | 9.3% |
| 5 | 220831 | 9.0% |
| O | 143095 | 5.8% |
| C | 141373 | 5.7% |
| D | 139791 | 5.7% |
| A | 133958 | 5.4% |
| N | 132674 | 5.4% |
| e | 58422 | 2.4% |
| Other values (55) | 592921 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 1452269 | |
| Decimal Number | 467474 | 19.0% |
| Lowercase Letter | 454167 | 18.4% |
| Space Separator | 54455 | 2.2% |
| Dash Punctuation | 18156 | 0.7% |
| Open Punctuation | 6902 | 0.3% |
| Close Punctuation | 6902 | 0.3% |
| Other Punctuation | 5304 | 0.2% |
| Math Symbol | 114 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 58422 | |
| i | 48730 | |
| a | 36328 | 8.0% |
| c | 35306 | 7.8% |
| n | 35154 | 7.7% |
| r | 31867 | 7.0% |
| t | 29861 | 6.6% |
| l | 27348 | 6.0% |
| s | 23155 | 5.1% |
| o | 19035 | 4.2% |
| Other values (15) | 108961 |
Uppercase Letter
| Value | Count | Frequency (%) |
| L | 436575 | |
| E | 235756 | |
| O | 143095 | 9.9% |
| C | 141373 | 9.7% |
| D | 139791 | 9.6% |
| A | 133958 | 9.2% |
| N | 132674 | 9.1% |
| G | 18664 | 1.3% |
| M | 17026 | 1.2% |
| P | 12929 | 0.9% |
| Other values (10) | 40428 | 2.8% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 230347 | |
| 5 | 220831 | |
| 1 | 7507 | 1.6% |
| 2 | 4566 | 1.0% |
| 3 | 3965 | 0.8% |
| 9 | 85 | < 0.1% |
| 7 | 70 | < 0.1% |
| 4 | 57 | < 0.1% |
| 6 | 33 | < 0.1% |
| 8 | 13 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 3715 | |
| , | 1577 | |
| % | 9 | 0.2% |
| ' | 2 | < 0.1% |
| ; | 1 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 54455 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 18156 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 6902 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 6902 |
Math Symbol
| Value | Count | Frequency (%) |
| + | 114 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1906436 | |
| Common | 559307 | 22.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| L | 436575 | |
| E | 235756 | |
| O | 143095 | 7.5% |
| C | 141373 | 7.4% |
| D | 139791 | 7.3% |
| A | 133958 | 7.0% |
| N | 132674 | 7.0% |
| e | 58422 | 3.1% |
| i | 48730 | 2.6% |
| a | 36328 | 1.9% |
| Other values (35) | 399734 |
Common
| Value | Count | Frequency (%) |
| 0 | 230347 | |
| 5 | 220831 | |
| (space) | 54455 | 9.7% |
| - | 18156 | 3.2% |
| 1 | 7507 | 1.3% |
| ( | 6902 | 1.2% |
| ) | 6902 | 1.2% |
| 2 | 4566 | 0.8% |
| 3 | 3965 | 0.7% |
| . | 3715 | 0.7% |
| Other values (10) | 1961 | 0.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2465743 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| L | 436575 | |
| E | 235756 | 9.6% |
| 0 | 230347 | 9.3% |
| 5 | 220831 | 9.0% |
| O | 143095 | 5.8% |
| C | 141373 | 5.7% |
| D | 139791 | 5.7% |
| A | 133958 | 5.4% |
| N | 132674 | 5.4% |
| e | 58422 | 2.4% |
| Other values (55) | 592921 |
| Distinct | 978 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 207 |
|---|---|
| Median length | 4 |
| Mean length | 5.374454 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2625539 |
|---|---|
| Distinct characters | 83 |
| Distinct categories | 11 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 512 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | NOAEL |
|---|---|
| 2nd row | NOAEL |
| 3rd row | LOEL |
| 4th row | LOAEL |
| 5th row | NOAEL |
Words
| Value | Count | Frequency (%) |
| ld50 | 129158 | |
| noael | 70190 | |
| lc50 | 67053 | 11.9% |
| loael | 26271 | 4.7% |
| ec50 | 22165 | 3.9% |
| lel | 21490 | 3.8% |
| nel | 14673 | 2.6% |
| noec | 14612 | 2.6% |
| noel | 14143 | 2.5% |
| meg | 13461 | 2.4% |
| Other values (1139) | 168793 |
Most occurring characters
| Value | Count | Frequency (%) |
| L | 438358 | |
| E | 234741 | 8.9% |
| 0 | 230234 | 8.8% |
| 5 | 220855 | 8.4% |
| O | 148428 | 5.7% |
| C | 146201 | 5.6% |
| D | 142558 | 5.4% |
| A | 135705 | 5.2% |
| N | 133076 | 5.1% |
| e | 80435 | 3.1% |
| Other values (73) | 714948 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 1504251 | |
| Lowercase Letter | 538847 | 20.5% |
| Decimal Number | 467533 | 17.8% |
| Space Separator | 73500 | 2.8% |
| Connector Punctuation | 12341 | 0.5% |
| Other Punctuation | 8018 | 0.3% |
| Open Punctuation | 7058 | 0.3% |
| Close Punctuation | 7044 | 0.3% |
| Dash Punctuation | 6825 | 0.3% |
| Math Symbol | 121 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 80435 | |
| i | 51398 | 9.5% |
| n | 41855 | 7.8% |
| t | 41165 | 7.6% |
| a | 38436 | 7.1% |
| c | 37698 | 7.0% |
| r | 35637 | 6.6% |
| o | 34559 | 6.4% |
| f | 24247 | 4.5% |
| l | 24157 | 4.5% |
| Other values (16) | 129260 |
Uppercase Letter
| Value | Count | Frequency (%) |
| L | 438358 | |
| E | 234741 | |
| O | 148428 | 9.9% |
| C | 146201 | 9.7% |
| D | 142558 | 9.5% |
| A | 135705 | 9.0% |
| N | 133076 | 8.8% |
| P | 21206 | 1.4% |
| G | 20243 | 1.3% |
| S | 20206 | 1.3% |
| Other values (15) | 63529 | 4.2% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 4462 | |
| : | 3256 | |
| , | 164 | 2.0% |
| % | 57 | 0.7% |
| / | 35 | 0.4% |
| * | 29 | 0.4% |
| ? | 7 | 0.1% |
| ; | 3 | < 0.1% |
| " | 2 | < 0.1% |
| ' | 2 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 230234 | |
| 5 | 220855 | |
| 1 | 7550 | 1.6% |
| 2 | 4593 | 1.0% |
| 3 | 3988 | 0.9% |
| 9 | 88 | < 0.1% |
| 4 | 85 | < 0.1% |
| 7 | 71 | < 0.1% |
| 6 | 49 | < 0.1% |
| 8 | 20 | < 0.1% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 116 | |
| = | 3 | 2.5% |
| > | 2 | 1.7% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 7057 | |
| [ | 1 | < 0.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 7043 | |
| ] | 1 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 73500 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 12341 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 6825 |
Modifier Symbol
| Value | Count | Frequency (%) |
| ^ | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2043098 | |
| Common | 582441 | 22.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| L | 438358 | |
| E | 234741 | |
| O | 148428 | 7.3% |
| C | 146201 | 7.2% |
| D | 142558 | 7.0% |
| A | 135705 | 6.6% |
| N | 133076 | 6.5% |
| e | 80435 | 3.9% |
| i | 51398 | 2.5% |
| n | 41855 | 2.0% |
| Other values (41) | 490343 |
Common
| Value | Count | Frequency (%) |
| 0 | 230234 | |
| 5 | 220855 | |
| (space) | 73500 | 12.6% |
| _ | 12341 | 2.1% |
| 1 | 7550 | 1.3% |
| ( | 7057 | 1.2% |
| ) | 7043 | 1.2% |
| - | 6825 | 1.2% |
| 2 | 4593 | 0.8% |
| . | 4462 | 0.8% |
| Other values (22) | 7981 | 1.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2625539 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| L | 438358 | |
| E | 234741 | 8.9% |
| 0 | 230234 | 8.8% |
| 5 | 220855 | 8.4% |
| O | 148428 | 5.7% |
| C | 146201 | 5.6% |
| D | 142558 | 5.4% |
| A | 135705 | 5.2% |
| N | 133076 | 5.1% |
| e | 80435 | 3.1% |
| Other values (73) | 714948 |
toxval_subtype
Text
| Distinct | 153 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 76 |
|---|---|
| Median length | 1 |
| Mean length | 2.4315671 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1187874 |
|---|---|
| Distinct characters | 61 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 10 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
Words
| Value | Count | Frequency (%) |
| (empty string) | 452339 | |
| pac | 11733 | 2.0% |
| air | 11619 | 2.0% |
| short-term | 11592 | 2.0% |
| thq | 11309 | 2.0% |
| 1 | 9515 | 1.7% |
| negligible | 7014 | 1.2% |
| 0.1 | 5705 | 1.0% |
| 2 | 3911 | 0.7% |
| 3 | 3911 | 0.7% |
| Other values (128) | 43987 | 7.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 459673 | |
| (space) | 84113 | 7.1% |
| r | 58511 | 4.9% |
| i | 56618 | 4.8% |
| e | 46647 | 3.9% |
| t | 39887 | 3.4% |
| l | 32483 | 2.7% |
| A | 27460 | 2.3% |
| h | 27099 | 2.3% |
| a | 26093 | 2.2% |
| Other values (51) | 329290 |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 459673 | |
| Lowercase Letter | 443364 | |
| Uppercase Letter | 112755 | 9.5% |
| Space Separator | 84113 | 7.1% |
| Decimal Number | 50675 | 4.3% |
| Other Punctuation | 16156 | 1.4% |
| Math Symbol | 14320 | 1.2% |
| Close Punctuation | 3409 | 0.3% |
| Open Punctuation | 3409 | 0.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 58511 | |
| i | 56618 | |
| e | 46647 | |
| t | 39887 | |
| l | 32483 | 7.3% |
| h | 27099 | 6.1% |
| a | 26093 | 5.9% |
| o | 23941 | 5.4% |
| g | 21312 | 4.8% |
| n | 20858 | 4.7% |
| Other values (13) | 89915 |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 27460 | |
| C | 15008 | |
| T | 14301 | |
| S | 13356 | |
| P | 11733 | |
| N | 7449 | 6.6% |
| L | 7137 | 6.3% |
| E | 5866 | 5.2% |
| G | 3348 | 3.0% |
| M | 3239 | 2.9% |
| Other values (6) | 3858 | 3.4% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 19087 | |
| 0 | 12704 | |
| 2 | 6702 | 13.2% |
| 3 | 5851 | 11.5% |
| 5 | 2688 | 5.3% |
| 6 | 2229 | 4.4% |
| 8 | 700 | 1.4% |
| 4 | 680 | 1.3% |
| 7 | 19 | < 0.1% |
| 9 | 15 | < 0.1% |
Math Symbol
| Value | Count | Frequency (%) |
| = | 12857 | |
| > | 731 | 5.1% |
| < | 727 | 5.1% |
| + | 5 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 5710 | |
| , | 4864 | |
| : | 4351 | |
| / | 1231 | 7.6% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 459673 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 84113 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 3409 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 3409 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 631755 | |
| Latin | 556119 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 58511 | 10.5% |
| i | 56618 | 10.2% |
| e | 46647 | 8.4% |
| t | 39887 | 7.2% |
| l | 32483 | 5.8% |
| A | 27460 | 4.9% |
| h | 27099 | 4.9% |
| a | 26093 | 4.7% |
| o | 23941 | 4.3% |
| g | 21312 | 3.8% |
| Other values (29) | 196068 |
Common
| Value | Count | Frequency (%) |
| - | 459673 | |
| (space) | 84113 | 13.3% |
| 1 | 19087 | 3.0% |
| = | 12857 | 2.0% |
| 0 | 12704 | 2.0% |
| 2 | 6702 | 1.1% |
| 3 | 5851 | 0.9% |
| . | 5710 | 0.9% |
| , | 4864 | 0.8% |
| : | 4351 | 0.7% |
| Other values (12) | 15843 | 2.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1187874 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 459673 | |
| (space) | 84113 | 7.1% |
| r | 58511 | 4.9% |
| i | 56618 | 4.8% |
| e | 46647 | 3.9% |
| t | 39887 | 3.4% |
| l | 32483 | 2.7% |
| A | 27460 | 2.3% |
| h | 27099 | 2.3% |
| a | 26093 | 2.2% |
| Other values (51) | 329290 |
| Distinct | 175 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 76 |
|---|---|
| Median length | 1 |
| Mean length | 2.4438981 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1193898 |
|---|---|
| Distinct characters | 64 |
| Distinct categories | 10 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 14 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
| Value | Count | Frequency (%) |
| 451612 | ||
| air | 11619 | 2.1% |
| short-term | 11592 | 2.1% |
| thq | 11309 | 2.0% |
| negligible | 7014 | 1.3% |
| 0.1 | 5705 | 1.0% |
| 1 | 5604 | 1.0% |
| pac_3 | 3911 | 0.7% |
| pac_2 | 3911 | 0.7% |
| pac_1 | 3911 | 0.7% |
| Other values (155) | 44613 | 8.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 459154 | |
| (space) | 72279 | 6.1% |
| r | 57580 | 4.8% |
| i | 55946 | 4.7% |
| e | 47423 | 4.0% |
| t | 38653 | 3.2% |
| l | 31960 | 2.7% |
| A | 29766 | 2.5% |
| h | 26705 | 2.2% |
| o | 24609 | 2.1% |
| Other values (54) | 349823 |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 459154 | |
| Lowercase Letter | 433362 | |
| Uppercase Letter | 127717 | 10.7% |
| Space Separator | 72279 | 6.1% |
| Decimal Number | 51569 | 4.3% |
| Other Punctuation | 16510 | 1.4% |
| Math Symbol | 14320 | 1.2% |
| Connector Punctuation | 12169 | 1.0% |
| Close Punctuation | 3409 | 0.3% |
| Open Punctuation | 3409 | 0.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 57580 | |
| i | 55946 | |
| e | 47423 | |
| t | 38653 | |
| l | 31960 | 7.4% |
| h | 26705 | 6.2% |
| o | 24609 | 5.7% |
| a | 24339 | 5.6% |
| g | 21197 | 4.9% |
| n | 19989 | 4.6% |
| Other values (13) | 84961 |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 29766 | |
| C | 15831 | |
| T | 15792 | |
| S | 15084 | |
| P | 11757 | 9.2% |
| N | 9359 | 7.3% |
| L | 7845 | 6.1% |
| E | 6021 | 4.7% |
| G | 3605 | 2.8% |
| M | 3597 | 2.8% |
| Other values (8) | 9060 | 7.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 19153 | |
| 0 | 12806 | |
| 2 | 7001 | 13.6% |
| 3 | 5851 | 11.3% |
| 5 | 2688 | 5.2% |
| 6 | 2229 | 4.3% |
| 4 | 997 | 1.9% |
| 8 | 721 | 1.4% |
| 7 | 108 | 0.2% |
| 9 | 15 | < 0.1% |
Math Symbol
| Value | Count | Frequency (%) |
| = | 12857 | |
| > | 731 | 5.1% |
| < | 727 | 5.1% |
| + | 5 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 5710 | |
| , | 5004 | |
| : | 4356 | |
| / | 1440 | 8.7% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 459154 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 72279 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 12169 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 3409 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 3409 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 632819 | |
| Latin | 561079 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 57580 | 10.3% |
| i | 55946 | 10.0% |
| e | 47423 | 8.5% |
| t | 38653 | 6.9% |
| l | 31960 | 5.7% |
| A | 29766 | 5.3% |
| h | 26705 | 4.8% |
| o | 24609 | 4.4% |
| a | 24339 | 4.3% |
| g | 21197 | 3.8% |
| Other values (31) | 202901 |
Common
| Value | Count | Frequency (%) |
| - | 459154 | |
| (space) | 72279 | 11.4% |
| 1 | 19153 | 3.0% |
| = | 12857 | 2.0% |
| 0 | 12806 | 2.0% |
| _ | 12169 | 1.9% |
| 2 | 7001 | 1.1% |
| 3 | 5851 | 0.9% |
| . | 5710 | 0.9% |
| , | 5004 | 0.8% |
| Other values (13) | 20835 | 3.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1193898 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 459154 | |
| (space) | 72279 | 6.1% |
| r | 57580 | 4.8% |
| i | 55946 | 4.7% |
| e | 47423 | 4.0% |
| t | 38653 | 3.2% |
| l | 31960 | 2.7% |
| A | 29766 | 2.5% |
| h | 26705 | 2.2% |
| o | 24609 | 2.1% |
| Other values (54) | 349823 |
toxval_numeric
Real number (ℝ)
HIGH CORRELATION  SKEWED 
| Distinct | 25718 |
|---|---|
| Distinct (%) | 5.3% |
| Missing | 2013 |
| Missing (%) | 0.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 48919.83 |
| Minimum | -3.42 |
| Maximum | 2.5 × 10⁹ |
| Zeros | 155 |
| Zeros (%) | < 0.1% |
| Negative | 4 |
| Negative (%) | < 0.1% |
| Memory size | 3.7 MiB |
Quantile statistics
| Minimum | -3.42 |
|---|---|
| 5-th percentile | 0.016 |
| Q1 | 4.2 |
| median | 105 |
| Q3 | 1149 |
| 95-th percentile | 7500 |
| Maximum | 2.5 × 10⁹ |
| Range | 2.5 × 10⁹ |
| Interquartile range (IQR) | 1144.8 |
Descriptive statistics
| Standard deviation | 7877764.9 |
|---|---|
| Coefficient of variation (CV) | 161.03418 |
| Kurtosis | 50781.33 |
| Mean | 48919.83 |
| Median Absolute Deviation (MAD) | 104.974 |
| Skewness | 218.40091 |
| Sum | 2.3799938 × 10¹⁰ |
| Variance | 6.2059179 × 10¹³ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2000 | 26551 | 5.4% |
| 1000 | 19665 | 4.0% |
| 5000 | 14260 | 2.9% |
| 100 | 9245 | 1.9% |
| 500 | 8494 | 1.7% |
| 50 | 6328 | 1.3% |
| 10 | 5839 | 1.2% |
| 1 | 5707 | 1.2% |
| 200 | 5440 | 1.1% |
| 300 | 5282 | 1.1% |
| Other values (25708) | 379698 |
| Value | Count | Frequency (%) |
| -3.42 | 2 | < 0.1% |
| -3.34 | 2 | < 0.1% |
| 0 | 155 | |
| 1.76 × 10⁻¹⁸ | 1 | < 0.1% |
| 1.4 × 10⁻¹³ | 1 | < 0.1% |
| 2.13 × 10⁻¹¹ | 1 | < 0.1% |
| 5 × 10⁻¹¹ | 1 | < 0.1% |
| 7.4 × 10⁻¹¹ | 2 | < 0.1% |
| 1 × 10⁻¹⁰ | 1 | < 0.1% |
| 1.16 × 10⁻¹⁰ | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 2500000000 | 1 | < 0.1% |
| 1800000000 | 1 | < 0.1% |
| 1500000000 | 9 | |
| 430000000 | 1 | < 0.1% |
| 250000000 | 1 | < 0.1% |
| 190000000 | 1 | < 0.1% |
| 180000000 | 1 | < 0.1% |
| 120000000 | 1 | < 0.1% |
| 110000000 | 1 | < 0.1% |
| 100000000 | 5 |
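The derived figures reported for toxval_numeric follow from the primary ones by simple arithmetic: IQR = Q3 - Q1, range = maximum - minimum, CV = standard deviation / mean, and variance = standard deviation squared. The following is a minimal Python sketch that rechecks them from the rounded values printed in the tables above; the variable names are illustrative and not taken from the profiling tool, and small differences in the last digits are expected because the inputs are already rounded.

```python
# Rounded values copied from the toxval_numeric tables above.
q1, q3 = 4.2, 1149.0
minimum, maximum = -3.42, 2.5e9
mean, std = 48919.83, 7877764.9

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance

print(f"IQR      = {iqr:.1f}")
print(f"Range    = {value_range:.3g}")
print(f"CV       = {cv:.4f}")
print(f"Variance = {variance:.5g}")
```

The mean (48919.83) sits far above the median (105), which is consistent with the SKEWED flag on this variable.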
toxval_numeric_original
Real number (ℝ)
HIGH CORRELATION  SKEWED 
| Distinct | 17596 |
|---|---|
| Distinct (%) | 3.6% |
| Missing | 2013 |
| Missing (%) | 0.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 20377.644 |
| Minimum | -3.42 |
| Maximum | 2.5 × 10⁹ |
| Zeros | 155 |
| Zeros (%) | < 0.1% |
| Negative | 4 |
| Negative (%) | < 0.1% |
| Memory size | 3.7 MiB |
Quantile statistics
| Minimum | -3.42 |
|---|---|
| 5-th percentile | 0.02 |
| Q1 | 4.49 |
| median | 105 |
| Q3 | 1135 |
| 95-th percentile | 7460 |
| Maximum | 2.5 × 10⁹ |
| Range | 2.5 × 10⁹ |
| Interquartile range (IQR) | 1130.51 |
Descriptive statistics
| Standard deviation | 4520602.5 |
|---|---|
| Coefficient of variation (CV) | 221.84128 |
| Kurtosis | 244120.58 |
| Mean | 20377.644 |
| Median Absolute Deviation (MAD) | 104.97 |
| Skewness | 480.03067 |
| Sum | 9.9139071 × 10⁹ |
| Variance | 2.0435847 × 10¹³ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2000 | 26969 | 5.5% |
| 1000 | 20350 | 4.2% |
| 5000 | 14642 | 3.0% |
| 100 | 9844 | 2.0% |
| 500 | 8782 | 1.8% |
| 50 | 6500 | 1.3% |
| 10 | 6214 | 1.3% |
| 1 | 5968 | 1.2% |
| 200 | 5806 | 1.2% |
| 300 | 5516 | 1.1% |
| Other values (17586) | 375918 |
| Value | Count | Frequency (%) |
| -3.42 | 2 | < 0.1% |
| -3.34 | 2 | < 0.1% |
| 0 | 155 | |
| 1.76 × 10⁻¹⁸ | 1 | < 0.1% |
| 1.4 × 10⁻¹³ | 1 | < 0.1% |
| 2.13 × 10⁻¹¹ | 1 | < 0.1% |
| 5 × 10⁻¹¹ | 1 | < 0.1% |
| 1 × 10⁻¹⁰ | 1 | < 0.1% |
| 1.16 × 10⁻¹⁰ | 1 | < 0.1% |
| 2 × 10⁻¹⁰ | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 2500000000 | 1 | < 0.1% |
| 1800000000 | 1 | < 0.1% |
| 430000000 | 1 | < 0.1% |
| 250000000 | 1 | < 0.1% |
| 190000000 | 1 | < 0.1% |
| 180000000 | 1 | < 0.1% |
| 120000000 | 1 | < 0.1% |
| 110000000 | 1 | < 0.1% |
| 100000000 | 5 | |
| 66000000 | 1 | < 0.1% |
toxval_numeric_converted
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 488522 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 3.7 MiB |
toxval_numeric_standard
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 488522 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 3.7 MiB |
toxval_numeric_human
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 488522 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 3.7 MiB |
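The three columns above (toxval_numeric_converted, toxval_numeric_standard, toxval_numeric_human) are flagged MISSING and REJECTED because all 488522 of their cells are empty. The following is a minimal sketch of how such columns could be detected before analysis, using plain Python on a tiny hypothetical sample. The sample rows and the helper name `fully_missing_columns` are assumptions for illustration; only the column names come from the report.

```python
def fully_missing_columns(rows):
    """Return the sorted names of columns whose value is None in every row."""
    if not rows:
        return []
    columns = rows[0].keys()
    return sorted(
        name for name in columns
        if all(row.get(name) is None for row in rows)
    )

# Hypothetical sample rows; the values are made up for the example.
sample = [
    {"toxval_numeric": 2000.0, "toxval_numeric_converted": None},
    {"toxval_numeric": None, "toxval_numeric_converted": None},
    {"toxval_numeric": 105.0, "toxval_numeric_converted": None},
]

print(fully_missing_columns(sample))
```

A column with only some missing cells, such as toxval_numeric here, is not reported by this check.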
toxval_units
Text
| Distinct | 226 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 36 |
|---|---|
| Median length | 34 |
| Mean length | 5.97935 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2921044 |
|---|---|
| Distinct characters | 62 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 96 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | - |
|---|---|
| 2nd row | mg/kg-day |
| 3rd row | - |
| 4th row | mg/kg-day |
| 5th row | mg/kg-day |
| Value | Count | Frequency (%) |
| mg/kg-day | 148537 | |
| mg/kg | 134248 | |
| mg/l | 111104 | |
| mg/m3 | 61476 | |
| 9838 | 2.0% | |
| ppm | 8427 | 1.7% |
| ml/kg | 4608 | 0.9% |
| bw | 3659 | 0.7% |
| unitless | 2548 | 0.5% |
| mg/kg-day)-1 | 1554 | 0.3% |
| Other values (237) | 8337 | 1.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| g | 751402 | |
| m | 538548 | |
| / | 466844 | |
| k | 289327 | 9.9% |
| - | 160669 | 5.5% |
| d | 152417 | 5.2% |
| a | 152339 | 5.2% |
| y | 150938 | 5.2% |
| L | 117273 | 4.0% |
| 3 | 62640 | 2.1% |
| Other values (52) | 78647 | 2.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 2097225 | |
| Other Punctuation | 468953 | 16.1% |
| Dash Punctuation | 160669 | 5.5% |
| Uppercase Letter | 117971 | 4.0% |
| Decimal Number | 65336 | 2.2% |
| Space Separator | 5816 | 0.2% |
| Open Punctuation | 2537 | 0.1% |
| Close Punctuation | 2537 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| g | 751402 | |
| m | 538548 | |
| k | 289327 | 13.8% |
| d | 152417 | 7.3% |
| a | 152339 | 7.3% |
| y | 150938 | 7.2% |
| p | 18548 | 0.9% |
| e | 6982 | 0.3% |
| s | 5805 | 0.3% |
| i | 5539 | 0.3% |
| Other values (14) | 25380 | 1.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| L | 117273 | |
| M | 383 | 0.3% |
| N | 80 | 0.1% |
| I | 45 | < 0.1% |
| U | 43 | < 0.1% |
| D | 29 | < 0.1% |
| T | 23 | < 0.1% |
| A | 22 | < 0.1% |
| C | 20 | < 0.1% |
| W | 16 | < 0.1% |
| Other values (10) | 37 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 3 | 62640 | |
| 1 | 2583 | 4.0% |
| 2 | 67 | 0.1% |
| 0 | 26 | < 0.1% |
| 4 | 6 | < 0.1% |
| 6 | 5 | < 0.1% |
| 5 | 3 | < 0.1% |
| 7 | 3 | < 0.1% |
| 9 | 2 | < 0.1% |
| 8 | 1 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 466844 | |
| % | 2046 | 0.4% |
| ; | 59 | < 0.1% |
| . | 4 | < 0.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 160669 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 5816 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 2537 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 2537 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2215196 | |
| Common | 705848 | 24.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| g | 751402 | |
| m | 538548 | |
| k | 289327 | 13.1% |
| d | 152417 | 6.9% |
| a | 152339 | 6.9% |
| y | 150938 | 6.8% |
| L | 117273 | 5.3% |
| p | 18548 | 0.8% |
| e | 6982 | 0.3% |
| s | 5805 | 0.3% |
| Other values (34) | 31617 | 1.4% |
Common
| Value | Count | Frequency (%) |
| / | 466844 | |
| - | 160669 | 22.8% |
| 3 | 62640 | 8.9% |
| (space) | 5816 | 0.8% |
| 1 | 2583 | 0.4% |
| ( | 2537 | 0.4% |
| ) | 2537 | 0.4% |
| % | 2046 | 0.3% |
| 2 | 67 | < 0.1% |
| ; | 59 | < 0.1% |
| Other values (8) | 50 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2921044 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| g | 751402 | |
| m | 538548 | |
| / | 466844 | |
| k | 289327 | 9.9% |
| - | 160669 | 5.5% |
| d | 152417 | 5.2% |
| a | 152339 | 5.2% |
| y | 150938 | 5.2% |
| L | 117273 | 4.0% |
| 3 | 62640 | 2.1% |
| Other values (52) | 78647 | 2.7% |
| Distinct | 836 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 1 |
| Missing (%) | < 0.1% |
| Memory size | 3.7 MiB |
Length
| Max length | 255 |
|---|---|
| Median length | 164 |
| Mean length | 8.565075 |
| Min length | 1 |
Characters and Unicode
| Total characters | 4184219 |
|---|---|
| Distinct characters | 78 |
| Distinct categories | 10 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 413 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | - |
|---|---|
| 2nd row | mg/kg bw/day (nominal) |
| 3rd row | - |
| 4th row | mg/kg bw/day (nominal) |
| 5th row | ppm |
| Value | Count | Frequency (%) |
| mg/kg | 205420 | |
| mg/l | 110084 | |
| bw/day | 69330 | 9.2% |
| bw | 67435 | 9.0% |
| mg/kg-day | 64836 | 8.6% |
| mg/m3 | 35415 | 4.7% |
| nominal | 28457 | 3.8% |
| ppm | 25911 | 3.5% |
| air | 23693 | 3.2% |
| dose | 18554 | 2.5% |
| Other values (856) | 100731 |
Most occurring characters
| Value | Count | Frequency (%) |
| g | 726616 | |
| m | 531692 | |
| / | 520248 | |
| k | 284307 | 6.8% |
| (space) | 261375 | 6.2% |
| a | 255726 | 6.1% |
| d | 180369 | 4.3% |
| y | 149058 | 3.6% |
| b | 142241 | 3.4% |
| w | 140809 | 3.4% |
| Other values (68) | 991778 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3027402 | |
| Other Punctuation | 523230 | 12.5% |
| Space Separator | 261375 | 6.2% |
| Uppercase Letter | 132993 | 3.2% |
| Dash Punctuation | 79494 | 1.9% |
| Close Punctuation | 57581 | 1.4% |
| Open Punctuation | 57579 | 1.4% |
| Decimal Number | 44472 | 1.1% |
| Modifier Symbol | 83 | < 0.1% |
| Math Symbol | 10 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| g | 726616 | |
| m | 531692 | |
| k | 284307 | 9.4% |
| a | 255726 | 8.4% |
| d | 180369 | 6.0% |
| y | 149058 | 4.9% |
| b | 142241 | 4.7% |
| w | 140809 | 4.7% |
| i | 85734 | 2.8% |
| e | 82141 | 2.7% |
| Other values (16) | 448709 |
Uppercase Letter
| Value | Count | Frequency (%) |
| L | 123089 | |
| A | 3792 | 2.9% |
| D | 2142 | 1.6% |
| M | 1315 | 1.0% |
| O | 673 | 0.5% |
| C | 526 | 0.4% |
| H | 501 | 0.4% |
| E | 495 | 0.4% |
| N | 136 | 0.1% |
| W | 65 | < 0.1% |
| Other values (14) | 259 | 0.2% |
Decimal Number
| Value | Count | Frequency (%) |
| 3 | 40725 | |
| 1 | 2613 | 5.9% |
| 0 | 329 | 0.7% |
| 2 | 266 | 0.6% |
| 5 | 130 | 0.3% |
| 8 | 109 | 0.2% |
| 4 | 102 | 0.2% |
| 7 | 73 | 0.2% |
| 6 | 63 | 0.1% |
| 9 | 62 | 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 520248 | |
| % | 2055 | 0.4% |
| ; | 608 | 0.1% |
| . | 149 | < 0.1% |
| , | 135 | < 0.1% |
| : | 17 | < 0.1% |
| ? | 15 | < 0.1% |
| * | 2 | < 0.1% |
| & | 1 | < 0.1% |
Math Symbol
| Value | Count | Frequency (%) |
| = | 3 | |
| > | 3 | |
| ~ | 2 | |
| + | 2 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 261375 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 79494 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 57581 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 57579 |
Modifier Symbol
| Value | Count | Frequency (%) |
| ^ | 83 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3160395 | |
| Common | 1023824 | 24.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| g | 726616 | |
| m | 531692 | |
| k | 284307 | 9.0% |
| a | 255726 | 8.1% |
| d | 180369 | 5.7% |
| y | 149058 | 4.7% |
| b | 142241 | 4.5% |
| w | 140809 | 4.5% |
| L | 123089 | 3.9% |
| i | 85734 | 2.7% |
| Other values (40) | 540754 |
Common
| Value | Count | Frequency (%) |
| / | 520248 | |
| (space) | 261375 | |
| - | 79494 | 7.8% |
| ) | 57581 | 5.6% |
| ( | 57579 | 5.6% |
| 3 | 40725 | 4.0% |
| 1 | 2613 | 0.3% |
| % | 2055 | 0.2% |
| ; | 608 | 0.1% |
| 0 | 329 | < 0.1% |
| Other values (18) | 1217 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4184219 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| g | 726616 | |
| m | 531692 | |
| / | 520248 | |
| k | 284307 | 6.8% |
| (space) | 261375 | 6.2% |
| a | 255726 | 6.1% |
| d | 180369 | 4.3% |
| y | 149058 | 3.6% |
| b | 142241 | 3.4% |
| w | 140809 | 3.4% |
| Other values (68) | 991778 |
toxval_units_converted
Categorical
CONSTANT 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - |
|---|
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 488522 |
|---|---|
| Distinct characters | 1 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
Common Values
| Value | Count | Frequency (%) |
| - | 488522 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 488522 |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 488522 |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 488522 |
Most frequent character per category
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 488522 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 488522 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| - | 488522 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 488522 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 488522 |
toxval_units_standard
Categorical
CONSTANT 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - |
|---|
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 488522 |
|---|---|
| Distinct characters | 1 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
Common Values
| Value | Count | Frequency (%) |
| - | 488522 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 488522 |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 488522 |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 488522 |
Most frequent character per category
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 488522 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 488522 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| - | 488522 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 488522 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 488522 |
toxval_units_human
Categorical
CONSTANT 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - |
|---|
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 488522 |
|---|---|
| Distinct characters | 1 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
Common Values
| Value | Count | Frequency (%) |
| - | 488522 |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 488522 |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 488522 |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 488522 |
Most frequent character per category
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 488522 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 488522 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| - | 488522 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 488522 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 488522 |
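toxval_units_converted, toxval_units_standard and toxval_units_human each contain the single placeholder value "-" in all 488522 rows, which is why they carry the CONSTANT flag. The following is a minimal sketch of constant-column detection in plain Python. The helper name `constant_columns` and the sample rows are hypothetical; only the column names and the "-" placeholder come from the report.

```python
from collections import Counter

def constant_columns(rows):
    """Map each column that has exactly one distinct value to that value."""
    distinct = {}
    for row in rows:
        for name, value in row.items():
            distinct.setdefault(name, Counter())[value] += 1
    return {
        name: next(iter(counts))
        for name, counts in distinct.items()
        if len(counts) == 1
    }

# Hypothetical sample rows; the unit strings mirror values seen in the report.
sample = [
    {"toxval_units": "mg/kg-day", "toxval_units_converted": "-"},
    {"toxval_units": "mg/kg", "toxval_units_converted": "-"},
    {"toxval_units": "mg/kg-day", "toxval_units_converted": "-"},
]

print(constant_columns(sample))
```

Columns found this way can usually be dropped before modelling, since a single repeated value cannot distinguish one record from another.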
toxval_numeric_qualifier
Categorical
HIGH CORRELATION  IMBALANCE  MISSING 
| Distinct | 9 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 13610 |
| Missing (%) | 2.8% |
| Memory size | 3.7 MiB |
| = | 361959 |
|---|---|
| > | 77829 |
| >= | 16342 |
| ~ | 10750 |
| < | 7530 |
| Other values (4) | 502 |
Length
| Max length | 76 |
|---|---|
| Median length | 1 |
| Mean length | 1.0411024 |
| Min length | 1 |
Characters and Unicode
| Total characters | 494432 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | ~ |
|---|---|
| 2nd row | = |
| 3rd row | = |
| 4th row | = |
| 5th row | = |
Common Values
| Value | Count | Frequency (%) |
| = | 361959 | |
| > | 77829 | 15.9% |
| >= | 16342 | 3.3% |
| ~ | 10750 | 2.2% |
| < | 7530 | 1.5% |
| <= | 461 | 0.1% |
| A value within a wider than usual range, adopted for classification purposes | 36 | < 0.1% |
| >= <= | 4 | < 0.1% |
| ~< | 1 | < 0.1% |
| (Missing) | 13610 | 2.8% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 474880 | ||
| a | 72 | < 0.1% |
| value | 36 | < 0.1% |
| within | 36 | < 0.1% |
| wider | 36 | < 0.1% |
| than | 36 | < 0.1% |
| usual | 36 | < 0.1% |
| range | 36 | < 0.1% |
| adopted | 36 | < 0.1% |
| for | 36 | < 0.1% |
| Other values (2) | 72 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| = | 378770 | |
| > | 94175 | 19.0% |
| ~ | 10751 | 2.2% |
| < | 7996 | 1.6% |
| (space) | 400 | 0.1% |
| a | 288 | 0.1% |
| i | 216 | < 0.1% |
| s | 180 | < 0.1% |
| e | 180 | < 0.1% |
| u | 144 | < 0.1% |
| Other values (15) | 1332 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Math Symbol | 491692 | |
| Lowercase Letter | 2268 | 0.5% |
| Space Separator | 400 | 0.1% |
| Other Punctuation | 36 | < 0.1% |
| Uppercase Letter | 36 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 288 | |
| i | 216 | 9.5% |
| s | 180 | 7.9% |
| e | 180 | 7.9% |
| u | 144 | 6.3% |
| t | 144 | 6.3% |
| o | 144 | 6.3% |
| n | 144 | 6.3% |
| r | 144 | 6.3% |
| d | 108 | 4.8% |
| Other values (8) | 576 |
Math Symbol
| Value | Count | Frequency (%) |
| = | 378770 | |
| > | 94175 | 19.2% |
| ~ | 10751 | 2.2% |
| < | 7996 | 1.6% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 400 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 36 |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 36 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 492128 | |
| Latin | 2304 | 0.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 288 | |
| i | 216 | 9.4% |
| s | 180 | 7.8% |
| e | 180 | 7.8% |
| u | 144 | 6.2% |
| t | 144 | 6.2% |
| o | 144 | 6.2% |
| n | 144 | 6.2% |
| r | 144 | 6.2% |
| d | 108 | 4.7% |
| Other values (9) | 612 |
Common
| Value | Count | Frequency (%) |
| = | 378770 | |
| > | 94175 | 19.1% |
| ~ | 10751 | 2.2% |
| < | 7996 | 1.6% |
| (space) | 400 | 0.1% |
| , | 36 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 494432 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| = | 378770 | |
| > | 94175 | 19.0% |
| ~ | 10751 | 2.2% |
| < | 7996 | 1.6% |
| (space) | 400 | 0.1% |
| a | 288 | 0.1% |
| i | 216 | < 0.1% |
| s | 180 | < 0.1% |
| e | 180 | < 0.1% |
| u | 144 | < 0.1% |
| Other values (15) | 1332 | 0.3% |
toxval_numeric_qualifier_original
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - | 267542 |
|---|---|
| = | 108014 |
| > | 77829 |
| >= | 16342 |
| ca. | 10345 |
| Other values (8) | 8450 |
Length
| Max length | 76 |
|---|---|
| Median length | 1 |
| Mean length | 1.0857751 |
| Min length | 1 |
Characters and Unicode
| Total characters | 530425 |
|---|---|
| Distinct characters | 28 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | ca. |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
Common Values
| Value | Count | Frequency (%) |
| - | 267542 | |
| = | 108014 | |
| > | 77829 | 15.9% |
| >= | 16342 | 3.3% |
| ca. | 10345 | 2.1% |
| < | 7530 | 1.5% |
| <= | 461 | 0.1% |
| circa | 403 | 0.1% |
| A value within a wider than usual range, adopted for classification purposes | 36 | < 0.1% |
| between | 13 | < 0.1% |
| Other values (3) | 7 | < 0.1% |
Length
| Value | Count | Frequency (%) |
| 477729 | ||
| ca | 10346 | 2.1% |
| circa | 403 | 0.1% |
| a | 72 | < 0.1% |
| value | 36 | < 0.1% |
| within | 36 | < 0.1% |
| wider | 36 | < 0.1% |
| than | 36 | < 0.1% |
| usual | 36 | < 0.1% |
| range | 36 | < 0.1% |
| Other values (5) | 157 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 267542 | |
| = | 124825 | |
| > | 94175 | 17.8% |
| c | 11224 | 2.1% |
| a | 11037 | 2.1% |
| . | 10346 | 2.0% |
| < | 7996 | 1.5% |
| i | 619 | 0.1% |
| r | 547 | 0.1% |
| (space) | 401 | 0.1% |
| Other values (18) | 1713 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 267542 | |
| Math Symbol | 226998 | |
| Lowercase Letter | 25066 | 4.7% |
| Other Punctuation | 10382 | 2.0% |
| Space Separator | 401 | 0.1% |
| Uppercase Letter | 36 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| c | 11224 | |
| a | 11037 | |
| i | 619 | 2.5% |
| r | 547 | 2.2% |
| e | 219 | 0.9% |
| s | 180 | 0.7% |
| t | 157 | 0.6% |
| n | 157 | 0.6% |
| o | 144 | 0.6% |
| u | 144 | 0.6% |
| Other values (9) | 638 | 2.5% |
Math Symbol
| Value | Count | Frequency (%) |
| = | 124825 | |
| > | 94175 | |
| < | 7996 | 3.5% |
| ~ | 2 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 10346 | |
| , | 36 | 0.3% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 267542 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 401 |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 36 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 505323 | |
| Latin | 25102 | 4.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| c | 11224 | |
| a | 11037 | |
| i | 619 | 2.5% |
| r | 547 | 2.2% |
| e | 219 | 0.9% |
| s | 180 | 0.7% |
| t | 157 | 0.6% |
| n | 157 | 0.6% |
| o | 144 | 0.6% |
| u | 144 | 0.6% |
| Other values (10) | 674 | 2.7% |
Common
| Value | Count | Frequency (%) |
| - | 267542 | |
| = | 124825 | |
| > | 94175 | 18.6% |
| . | 10346 | 2.0% |
| < | 7996 | 1.6% |
| (space) | 401 | 0.1% |
| , | 36 | < 0.1% |
| ~ | 2 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 530425 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 267542 | |
| = | 124825 | |
| > | 94175 | 17.8% |
| c | 11224 | 2.1% |
| a | 11037 | 2.1% |
| . | 10346 | 2.0% |
| < | 7996 | 1.5% |
| i | 619 | 0.1% |
| r | 547 | 0.1% |
| (space) | 401 | 0.1% |
| Other values (18) | 1713 | 0.3% |
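Comparing the Common Values tables of toxval_numeric_qualifier and toxval_numeric_qualifier_original suggests how the original qualifiers may have been normalized. The counts for `>`, `>=`, `<` and `<=` are identical in both columns, while `ca.` (10345) and `circa` (403) together account for nearly all of the 10750 `~` rows. The sketch below encodes that reading as a lookup. The mapping, the function name `normalize_qualifier`, and the choice to pass `-` through unchanged are assumptions inferred from the counts, not documented behavior of the pipeline that produced the data; in particular, the counts do not reveal how individual `-` rows were resolved.

```python
# Assumed mapping, inferred from matching counts in the two tables above.
APPROXIMATE = {"ca.", "circa", "~"}

def normalize_qualifier(original):
    """Return a normalized qualifier symbol for an original qualifier string."""
    text = original.strip()
    if text in APPROXIMATE:
        return "~"
    # All other values, including the "-" placeholder, pass through unchanged.
    return text

for raw in ["ca.", "circa", ">=", "-"]:
    print(repr(raw), "->", repr(normalize_qualifier(raw)))
```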
study_type
Categorical
HIGH CORRELATION 
| Distinct | 25 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| acute | 260638 |
|---|---|
| chronic | 66761 |
| subchronic | 44563 |
| developmental | 34008 |
| short-term | 30217 |
| Other values (20) | 52335 |
Length
| Max length | 26 |
|---|---|
| Median length | 5 |
| Mean length | 6.9263841 |
| Min length | 1 |
Characters and Unicode
| Total characters | 3383691 |
|---|---|
| Distinct characters | 23 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | short-term |
|---|---|
| 2nd row | short-term |
| 3rd row | short-term |
| 4th row | short-term |
| 5th row | subchronic |
Common Values
| Value | Count | Frequency (%) |
| acute | 260638 | |
| chronic | 66761 | 13.7% |
| subchronic | 44563 | 9.1% |
| developmental | 34008 | 7.0% |
| short-term | 30217 | 6.2% |
| - | 18765 | 3.8% |
| reproduction | 15097 | 3.1% |
| noncancer | 4968 | 1.0% |
| genotoxicity | 4848 | 1.0% |
| neurotoxicity | 2574 | 0.5% |
| Other values (15) | 6083 | 1.2% |
Length
| Value | Count | Frequency (%) |
| acute | 260695 | |
| chronic | 66768 | 13.5% |
| subchronic | 44594 | 9.0% |
| developmental | 34391 | 7.0% |
| short-term | 30252 | 6.1% |
| 18765 | 3.8% | |
| reproduction | 15480 | 3.1% |
| noncancer | 4968 | 1.0% |
| genotoxicity | 4848 | 1.0% |
| neurotoxicity | 2704 | 0.5% |
| Other values (11) | 10512 | 2.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| c | 519094 | |
| e | 434783 | |
| t | 394494 | |
| u | 325484 | |
| a | 303494 | |
| o | 235482 | |
| r | 218623 | |
| n | 184827 | 5.5% |
| h | 145839 | 4.3% |
| i | 145672 | 4.3% |
| Other values (13) | 475899 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3328903 | |
| Dash Punctuation | 49017 | 1.4% |
| Space Separator | 5455 | 0.2% |
| Uppercase Letter | 316 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| c | 519094 | |
| e | 434783 | |
| t | 394494 | |
| u | 325484 | |
| a | 303494 | |
| o | 235482 | |
| r | 218623 | |
| n | 184827 | 5.6% |
| h | 145839 | 4.4% |
| i | 145672 | 4.4% |
| Other values (10) | 421111 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 49017 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 5455 |
Uppercase Letter
| Value | Count | Frequency (%) |
| H | 316 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3329219 | |
| Common | 54472 | 1.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| c | 519094 | |
| e | 434783 | |
| t | 394494 | |
| u | 325484 | |
| a | 303494 | |
| o | 235482 | |
| r | 218623 | |
| n | 184827 | 5.6% |
| h | 145839 | 4.4% |
| i | 145672 | 4.4% |
| Other values (11) | 421427 |
Common
| Value | Count | Frequency (%) |
| - | 49017 | |
| (space) | 5455 | 10.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3383691 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| c | 519094 | |
| e | 434783 | |
| t | 394494 | |
| u | 325484 | |
| a | 303494 | |
| o | 235482 | |
| r | 218623 | |
| n | 184827 | 5.5% |
| h | 145839 | 4.3% |
| i | 145672 | 4.3% |
| Other values (13) | 475899 |
| Distinct | 95 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 99 |
|---|---|
| Median length | 98 |
| Mean length | 11.071372 |
| Min length | 1 |
Characters and Unicode
| Total characters | 5408609 |
|---|---|
| Distinct characters | 58 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 6 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | short-term repeated dose toxicity |
|---|---|
| 2nd row | short-term repeated dose toxicity |
| 3rd row | short-term repeated dose toxicity |
| 4th row | short-term repeated dose toxicity |
| 5th row | sub-chronic toxicity |
| Value | Count | Frequency (%) |
| acute | 247357 | |
| toxicity | 165452 | |
|  | 45010 | 6.1% |
| chronic | 44905 | 6.1% |
| developmental | 34135 | 4.6% |
| dose | 25995 | 3.5% |
| repeated | 25845 | 3.5% |
| short-term | 24617 | 3.3% |
| subchronic | 22835 | 3.1% |
| sub-chronic | 21242 | 2.9% |
| Other values (81) | 79371 | 10.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 743218 | |
| c | 675308 | |
| e | 567311 | |
| i | 493056 | |
| o | 400065 | 7.4% |
| a | 337372 | 6.2% |
| u | 318130 | 5.9% |
| (space) | 248242 | 4.6% |
| r | 239135 | 4.4% |
| n | 190057 | 3.5% |
| Other values (48) | 1196715 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4987224 | |
| Space Separator | 248242 | 4.6% |
| Dash Punctuation | 95765 | 1.8% |
| Uppercase Letter | 62390 | 1.2% |
| Decimal Number | 6479 | 0.1% |
| Other Punctuation | 5196 | 0.1% |
| Open Punctuation | 1133 | < 0.1% |
| Close Punctuation | 1133 | < 0.1% |
| Math Symbol | 1047 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 743218 | |
| c | 675308 | |
| e | 567311 | |
| i | 493056 | |
| o | 400065 | |
| a | 337372 | |
| u | 318130 | 6.4% |
| r | 239135 | 4.8% |
| n | 190057 | 3.8% |
| y | 183173 | 3.7% |
| Other values (12) | 840399 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 9739 | |
| C | 7009 | |
| T | 6106 | |
| A | 5383 | 8.6% |
| G | 4848 | 7.8% |
| E | 3461 | 5.5% |
| D | 3282 | 5.3% |
| M | 3258 | 5.2% |
| R | 3191 | 5.1% |
| I | 2632 | 4.2% |
| Other values (10) | 13481 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 3188 | |
| 1 | 1438 | |
| 3 | 1041 | 16.1% |
| 9 | 630 | 9.7% |
| 2 | 91 | 1.4% |
| 8 | 71 | 1.1% |
| 4 | 20 | 0.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 4660 | |
| , | 479 | 9.2% |
| : | 57 | 1.1% |
Math Symbol
| Value | Count | Frequency (%) |
| < | 975 | |
| > | 72 | 6.9% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 248242 | |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 95765 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 1133 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 1133 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5049614 | |
| Common | 358995 | 6.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 743218 | |
| c | 675308 | |
| e | 567311 | |
| i | 493056 | |
| o | 400065 | |
| a | 337372 | 6.7% |
| u | 318130 | 6.3% |
| r | 239135 | 4.7% |
| n | 190057 | 3.8% |
| y | 183173 | 3.6% |
| Other values (32) | 902789 |
Common
| Value | Count | Frequency (%) |
| (space) | 248242 | |
| - | 95765 | 26.7% |
| / | 4660 | 1.3% |
| 0 | 3188 | 0.9% |
| 1 | 1438 | 0.4% |
| ( | 1133 | 0.3% |
| ) | 1133 | 0.3% |
| 3 | 1041 | 0.3% |
| < | 975 | 0.3% |
| 9 | 630 | 0.2% |
| Other values (6) | 790 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5408609 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 743218 | |
| c | 675308 | |
| e | 567311 | |
| i | 493056 | |
| o | 400065 | 7.4% |
| a | 337372 | 6.2% |
| u | 318130 | 5.9% |
| (space) | 248242 | 4.6% |
| r | 239135 | 4.4% |
| n | 190057 | 3.5% |
| Other values (48) | 1196715 |
study_duration_class
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 32 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - | |
|---|---|
| terminal | |
| second mating | 1876 |
| interim | 1794 |
| chronic | 1775 |
| Other values (27) | 4085 |
Length
| Max length | 27 |
|---|---|
| Median length | 1 |
| Mean length | 1.876894 |
| Min length | 1 |
Characters and Unicode
| Total characters | 916904 |
|---|---|
| Distinct characters | 29 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
Common Values
| Value | Count | Frequency (%) |
| - | 428643 | |
| terminal | 50349 | 10.3% |
| second mating | 1876 | 0.4% |
| interim | 1794 | 0.4% |
| chronic | 1775 | 0.4% |
| subchronic | 1422 | 0.3% |
| interim1 | 511 | 0.1% |
| recovery | 509 | 0.1% |
| interim2 | 379 | 0.1% |
| satellite | 306 | 0.1% |
| Other values (22) | 958 | 0.2% |
Length
| Value | Count | Frequency (%) |
|  | 428643 | |
| terminal | 50349 | 10.3% |
| second | 1876 | 0.4% |
| mating | 1876 | 0.4% |
| interim | 1794 | 0.4% |
| chronic | 1777 | 0.4% |
| subchronic | 1429 | 0.3% |
| interim1 | 511 | 0.1% |
| recovery | 509 | 0.1% |
| interim2 | 379 | 0.1% |
| Other values (22) | 1269 | 0.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 428645 | |
| i | 62079 | 6.8% |
| n | 60658 | 6.6% |
| r | 58038 | 6.3% |
| e | 57579 | 6.3% |
| t | 56423 | 6.2% |
| m | 55614 | 6.1% |
| a | 53133 | 5.8% |
| l | 51487 | 5.6% |
| c | 8960 | 1.0% |
| Other values (19) | 24288 | 2.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 484802 | |
| Dash Punctuation | 428645 | |
| Space Separator | 1890 | 0.2% |
| Decimal Number | 1560 | 0.2% |
| Other Punctuation | 7 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 62079 | |
| n | 60658 | |
| r | 58038 | |
| e | 57579 | |
| t | 56423 | |
| m | 55614 | |
| a | 53133 | |
| l | 51487 | |
| c | 8960 | 1.8% |
| o | 5802 | 1.2% |
| Other values (10) | 15029 | 3.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 735 | |
| 2 | 586 | |
| 3 | 179 | 11.5% |
| 4 | 50 | 3.2% |
| 5 | 6 | 0.4% |
| 6 | 4 | 0.3% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 428645 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1890 | |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 7 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 484802 | |
| Common | 432102 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 62079 | |
| n | 60658 | |
| r | 58038 | |
| e | 57579 | |
| t | 56423 | |
| m | 55614 | |
| a | 53133 | |
| l | 51487 | |
| c | 8960 | 1.8% |
| o | 5802 | 1.2% |
| Other values (10) | 15029 | 3.1% |
Common
| Value | Count | Frequency (%) |
| - | 428645 | |
| (space) | 1890 | 0.4% |
| 1 | 735 | 0.2% |
| 2 | 586 | 0.1% |
| 3 | 179 | < 0.1% |
| 4 | 50 | < 0.1% |
| , | 7 | < 0.1% |
| 5 | 6 | < 0.1% |
| 6 | 4 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 916904 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 428645 | |
| i | 62079 | 6.8% |
| n | 60658 | 6.6% |
| r | 58038 | 6.3% |
| e | 57579 | 6.3% |
| t | 56423 | 6.2% |
| m | 55614 | 6.1% |
| a | 53133 | 5.8% |
| l | 51487 | 5.6% |
| c | 8960 | 1.0% |
| Other values (19) | 24288 | 2.6% |
| Distinct | 51 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 64 |
|---|---|
| Median length | 1 |
| Mean length | 1.8842836 |
| Min length | 1 |
Characters and Unicode
| Total characters | 920514 |
|---|---|
| Distinct characters | 38 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 5 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
| Value | Count | Frequency (%) |
|  | 428504 | |
| terminal | 50349 | 10.3% |
| second | 1876 | 0.4% |
| mating | 1876 | 0.4% |
| interim | 1794 | 0.4% |
| chronic | 1791 | 0.4% |
| sub-chronic | 980 | 0.2% |
| interim1 | 511 | 0.1% |
| recovery | 509 | 0.1% |
| subchronic | 457 | 0.1% |
| Other values (50) | 1962 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 429493 | |
| i | 62253 | 6.8% |
| n | 60788 | 6.6% |
| r | 58290 | 6.3% |
| e | 57909 | 6.3% |
| t | 56703 | 6.2% |
| m | 55646 | 6.0% |
| a | 53257 | 5.8% |
| l | 51601 | 5.6% |
| c | 8797 | 1.0% |
| Other values (28) | 25777 | 2.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 485751 | |
| Dash Punctuation | 429493 | |
| Space Separator | 2087 | 0.2% |
| Decimal Number | 1560 | 0.2% |
| Uppercase Letter | 1522 | 0.2% |
| Other Punctuation | 35 | < 0.1% |
| Open Punctuation | 33 | < 0.1% |
| Close Punctuation | 33 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 62253 | |
| n | 60788 | |
| r | 58290 | |
| e | 57909 | |
| t | 56703 | |
| m | 55646 | |
| a | 53257 | |
| l | 51601 | |
| c | 8797 | 1.8% |
| o | 5963 | 1.2% |
| Other values (12) | 14544 | 3.0% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 735 | |
| 2 | 586 | |
| 3 | 179 | 11.5% |
| 4 | 50 | 3.2% |
| 5 | 6 | 0.4% |
| 6 | 4 | 0.3% |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 1047 | |
| C | 337 | 22.1% |
| O | 136 | 8.9% |
| A | 2 | 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 25 | |
| / | 10 | 28.6% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 429493 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2087 | |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 33 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 33 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 487273 | |
| Common | 433241 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 62253 | |
| n | 60788 | |
| r | 58290 | |
| e | 57909 | |
| t | 56703 | |
| m | 55646 | |
| a | 53257 | |
| l | 51601 | |
| c | 8797 | 1.8% |
| o | 5963 | 1.2% |
| Other values (16) | 16066 | 3.3% |
Common
| Value | Count | Frequency (%) |
| - | 429493 | |
| (space) | 2087 | 0.5% |
| 1 | 735 | 0.2% |
| 2 | 586 | 0.1% |
| 3 | 179 | < 0.1% |
| 4 | 50 | < 0.1% |
| ( | 33 | < 0.1% |
| ) | 33 | < 0.1% |
| , | 25 | < 0.1% |
| / | 10 | < 0.1% |
| Other values (2) | 10 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 920514 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 429493 | |
| i | 62253 | 6.8% |
| n | 60788 | 6.6% |
| r | 58290 | 6.3% |
| e | 57909 | 6.3% |
| t | 56703 | 6.2% |
| m | 55646 | 6.0% |
| a | 53257 | 5.8% |
| l | 51601 | 5.6% |
| c | 8797 | 1.0% |
| Other values (28) | 25777 | 2.8% |
study_duration_value
Real number (ℝ)
| Distinct | 573 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -386.41906 |
| Minimum | -999 |
|---|---|
| Maximum | 8640 |
| Zeros | 57 |
| Zeros (%) | < 0.1% |
| Negative | 196941 |
| Negative (%) | 40.3% |
| Memory size | 3.7 MiB |
Quantile statistics
| Minimum | -999 |
|---|---|
| 5-th percentile | -999 |
| Q1 | -999 |
| median | 1 |
| Q3 | 7 |
| 95-th percentile | 90 |
| Maximum | 8640 |
| Range | 9639 |
| Interquartile range (IQR) | 1006 |
Descriptive statistics
| Standard deviation | 508.26194 |
|---|---|
| Coefficient of variation (CV) | -1.3153128 |
| Kurtosis | -0.73338424 |
| Mean | -386.41906 |
| Median Absolute Deviation (MAD) | 41 |
| Skewness | -0.25860406 |
| Sum | -1.8877421 × 10^8 |
| Variance | 258330.2 |
| Monotonicity | Not monotonic |
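The derived figures in the quantile and descriptive tables follow from the other reported values by simple arithmetic. The sketch below recomputes them as a consistency check. It is an illustration only: the variable names are made up for the example, and the inputs are the rounded values printed in the tables, so small rounding differences are expected.

```python
# Reported values for study_duration_value, copied from the tables above.
n_rows = 488522            # number of observations; the column has no missing cells
mean = -386.41906
std = 508.26194
q1, q3 = -999, 7
minimum, maximum = -999, 8640

cv = std / mean                    # coefficient of variation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range
total = mean * n_rows              # sum, up to rounding of the reported mean
variance = std ** 2

print(f"CV={cv:.7f} IQR={iqr} range={value_range} sum={total:.7e} variance={variance:.1f}")
```

The negative coefficient of variation is a direct consequence of the negative mean, which in turn is driven by the large number of -999 entries.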
| Value | Count | Frequency (%) |
| -999 | 196940 | |
| 1 | 66379 | 13.6% |
| 4 | 47810 | 9.8% |
| 2 | 37465 | 7.7% |
| 24 | 13424 | 2.7% |
| 13 | 12909 | 2.6% |
| 90 | 12290 | 2.5% |
| 3 | 9950 | 2.0% |
| 28 | 9074 | 1.9% |
| 14 | 7342 | 1.5% |
| Other values (563) | 74939 | 15.3% |
| Value | Count | Frequency (%) |
| -999 | 196940 | |
| -7 | 1 | < 0.1% |
| 0 | 57 | < 0.1% |
| 0.04 | 1 | < 0.1% |
| 0.0416667 | 4 | < 0.1% |
| 0.05 | 1 | < 0.1% |
| 0.08 | 1 | < 0.1% |
| 0.0833333 | 1 | < 0.1% |
| 0.125 | 2 | < 0.1% |
| 0.166667 | 5 | < 0.1% |
| Value | Count | Frequency (%) |
| 8640 | 1 | < 0.1% |
| 6840 | 2 | < 0.1% |
| 6576 | 1 | < 0.1% |
| 6480 | 7 | |
| 5544 | 1 | < 0.1% |
| 3864 | 1 | < 0.1% |
| 3360 | 1 | < 0.1% |
| 3192 | 1 | < 0.1% |
| 3024 | 2 | < 0.1% |
| 2880 | 1 | < 0.1% |
study_duration_value_original
Unsupported
REJECTED  UNSUPPORTED 
| Missing | 0 |
|---|---|
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
study_duration_units
Categorical
IMBALANCE 
| Distinct | 50 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| days | |
|---|---|
| - | |
| hours | |
| weeks | |
| generation | 11408 |
| Other values (45) | 17132 |
Length
| Max length | 255 |
|---|---|
| Median length | 216 |
| Mean length | 3.2248844 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1575427 |
|---|---|
| Distinct characters | 53 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 8 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | days |
|---|---|
| 2nd row | days |
| 3rd row | days |
| 4th row | days |
| 5th row | weeks |
Common Values
| Value | Count | Frequency (%) |
| days | 197816 | |
| - | 186765 | |
| hours | 42179 | 8.6% |
| weeks | 33222 | 6.8% |
| generation | 11408 | 2.3% |
| months | 8361 | 1.7% |
| years | 6199 | 1.3% |
| minutes | 1801 | 0.4% |
| gestational days | 112 | < 0.1% |
| Days | 109 | < 0.1% |
| Other values (40) | 550 | 0.1% |
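The table lists `days` and `Days` as separate categories, so part of the 50 distinct values is spelling variation rather than distinct units. A minimal sketch of case and whitespace normalisation is shown below, using a few of the counts from the table. `normalize_unit` is a hypothetical helper, and merging labels this way is an assumption about intent. It does not attempt to map semantically different labels such as `gestational days`.

```python
from collections import Counter

def normalize_unit(label):
    """Lower-case a unit label and collapse runs of whitespace."""
    return " ".join(label.strip().lower().split())

# A subset of the counts reported in the Common Values table above.
counts = {"days": 197816, "Days": 109, "hours": 42179, "weeks": 33222}

merged = Counter()
for label, count in counts.items():
    merged[normalize_unit(label)] += count

print(merged.most_common())
```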
Length
| Value | Count | Frequency (%) |
| days | 198152 | |
|  | 186772 | |
| hours | 42190 | 8.6% |
| weeks | 33309 | 6.8% |
| generation | 11408 | 2.3% |
| months | 8368 | 1.7% |
| years | 6199 | 1.3% |
| minutes | 1807 | 0.4% |
| gd | 132 | < 0.1% |
| gestational | 112 | < 0.1% |
| Other values (155) | 1284 | 0.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| s | 290564 | |
| a | 216694 | |
| y | 204442 | |
| d | 198244 | |
| - | 186990 | |
| e | 98404 | 6.2% |
| o | 62615 | 4.0% |
| r | 60148 | 3.8% |
| h | 50704 | 3.2% |
| u | 44072 | 2.8% |
| Other values (43) | 162550 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1385907 | |
| Dash Punctuation | 186990 | 11.9% |
| Space Separator | 1211 | 0.1% |
| Decimal Number | 571 | < 0.1% |
| Uppercase Letter | 500 | < 0.1% |
| Other Punctuation | 184 | < 0.1% |
| Open Punctuation | 32 | < 0.1% |
| Close Punctuation | 32 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| s | 290564 | |
| a | 216694 | |
| y | 204442 | |
| d | 198244 | |
| e | 98404 | 7.1% |
| o | 62615 | 4.5% |
| r | 60148 | 4.3% |
| h | 50704 | 3.7% |
| u | 44072 | 3.2% |
| n | 33701 | 2.4% |
| Other values (15) | 126319 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 132 | |
| 6 | 111 | |
| 0 | 98 | |
| 2 | 81 | |
| 5 | 35 | 6.1% |
| 7 | 33 | 5.8% |
| 9 | 23 | 4.0% |
| 3 | 22 | 3.9% |
| 8 | 19 | 3.3% |
| 4 | 17 | 3.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 254 | |
| G | 141 | |
| W | 86 | 17.2% |
| L | 4 | 0.8% |
| P | 4 | 0.8% |
| N | 4 | 0.8% |
| H | 4 | 0.8% |
| M | 3 | 0.6% |
Other Punctuation
| Value | Count | Frequency (%) |
| " | 103 | |
| . | 54 | |
| / | 12 | 6.5% |
| , | 10 | 5.4% |
| : | 3 | 1.6% |
| ; | 2 | 1.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 186990 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1211 | |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 32 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 32 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1386407 | |
| Common | 189020 | 12.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| s | 290564 | |
| a | 216694 | |
| y | 204442 | |
| d | 198244 | |
| e | 98404 | 7.1% |
| o | 62615 | 4.5% |
| r | 60148 | 4.3% |
| h | 50704 | 3.7% |
| u | 44072 | 3.2% |
| n | 33701 | 2.4% |
| Other values (23) | 126819 |
Common
| Value | Count | Frequency (%) |
| - | 186990 | |
| (space) | 1211 | 0.6% |
| 1 | 132 | 0.1% |
| 6 | 111 | 0.1% |
| " | 103 | 0.1% |
| 0 | 98 | 0.1% |
| 2 | 81 | < 0.1% |
| . | 54 | < 0.1% |
| 5 | 35 | < 0.1% |
| 7 | 33 | < 0.1% |
| Other values (10) | 172 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1575427 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| s | 290564 | |
| a | 216694 | |
| y | 204442 | |
| d | 198244 | |
| - | 186990 | |
| e | 98404 | 6.2% |
| o | 62615 | 4.0% |
| r | 60148 | 3.8% |
| h | 50704 | 3.2% |
| u | 44072 | 2.8% |
| Other values (43) | 162550 |
| Distinct | 4713 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 255 |
|---|---|
| Median length | 254 |
| Mean length | 5.0842685 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2483777 |
|---|---|
| Distinct characters | 78 |
| Distinct categories | 11 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 2200 |
|---|---|
| Unique (%) | 0.5% |
Sample
| 1st row | days |
|---|---|
| 2nd row | range-finding: 14 days main study: males were dosed daily for 2 weeks prior to pairing, during the pairing period and a further 2 weeks before necropsy; a total of 6 weeks treatment prior to necropsy. females were dosed once daily for 2 weeks prior to pai |
| 3rd row | days |
| 4th row | days |
| 5th row | weeks |
| Value | Count | Frequency (%) |
|  | 187565 | |
| day | 152043 | |
| days | 51989 | 7.5% |
| week | 22877 | 3.3% |
| weeks | 16296 | 2.3% |
| h | 16210 | 2.3% |
| hours | 15690 | 2.3% |
| generation | 11543 | 1.7% |
| hour | 10423 | 1.5% |
| the | 7346 | 1.1% |
| Other values (3454) | 203264 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 293132 | |
| d | 239474 | 9.6% |
| y | 220667 | 8.9% |
| (space) | 207141 | 8.3% |
| e | 205328 | 8.3% |
| - | 191730 | 7.7% |
| s | 131711 | 5.3% |
| o | 113127 | 4.6% |
| r | 103408 | 4.2% |
| t | 93897 | 3.8% |
| Other values (68) | 684162 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1960171 | |
| Space Separator | 207141 | 8.3% |
| Dash Punctuation | 191730 | 7.7% |
| Decimal Number | 66188 | 2.7% |
| Other Punctuation | 30210 | 1.2% |
| Uppercase Letter | 15736 | 0.6% |
| Open Punctuation | 6318 | 0.3% |
| Close Punctuation | 6038 | 0.2% |
| Math Symbol | 243 | < 0.1% |
| Modifier Symbol | 1 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 293132 | |
| d | 239474 | |
| y | 220667 | |
| e | 205328 | |
| s | 131711 | 6.7% |
| o | 113127 | 5.8% |
| r | 103408 | 5.3% |
| t | 93897 | 4.8% |
| n | 86390 | 4.4% |
| i | 72263 | 3.7% |
| Other values (16) | 400774 |
Uppercase Letter
| Value | Count | Frequency (%) |
| H | 8241 | |
| D | 4272 | |
| W | 1814 | 11.5% |
| G | 488 | 3.1% |
| M | 319 | 2.0% |
| P | 146 | 0.9% |
| N | 145 | 0.9% |
| O | 121 | 0.8% |
| Y | 118 | 0.7% |
| F | 21 | 0.1% |
| Other values (7) | 51 | 0.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 10023 | |
| . | 8511 | |
| : | 6169 | |
| / | 3800 | 12.6% |
| ; | 1011 | 3.3% |
| ? | 330 | 1.1% |
| % | 144 | 0.5% |
| " | 103 | 0.3% |
| ' | 45 | 0.1% |
| * | 39 | 0.1% |
| Other values (2) | 35 | 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 11976 | |
| 2 | 11624 | |
| 4 | 10001 | |
| 0 | 7679 | |
| 3 | 5031 | |
| 5 | 4957 | |
| 6 | 4908 | |
| 9 | 3960 | 6.0% |
| 8 | 3917 | 5.9% |
| 7 | 2135 | 3.2% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 136 | |
| = | 65 | |
| > | 24 | 9.9% |
| ~ | 16 | 6.6% |
| < | 2 | 0.8% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 6290 | |
| [ | 28 | 0.4% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 6021 | |
| ] | 17 | 0.3% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 207141 | |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 191730 |
Modifier Symbol
| Value | Count | Frequency (%) |
| ^ | 1 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1975907 | |
| Common | 507870 | 20.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 293132 | |
| d | 239474 | |
| y | 220667 | |
| e | 205328 | |
| s | 131711 | 6.7% |
| o | 113127 | 5.7% |
| r | 103408 | 5.2% |
| t | 93897 | 4.8% |
| n | 86390 | 4.4% |
| i | 72263 | 3.7% |
| Other values (33) | 416510 |
Common
| Value | Count | Frequency (%) |
| (space) | 207141 | |
| - | 191730 | |
| 1 | 11976 | 2.4% |
| 2 | 11624 | 2.3% |
| , | 10023 | 2.0% |
| 4 | 10001 | 2.0% |
| . | 8511 | 1.7% |
| 0 | 7679 | 1.5% |
| ( | 6290 | 1.2% |
| : | 6169 | 1.2% |
| Other values (25) | 36726 | 7.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2483777 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 293132 | |
| d | 239474 | 9.6% |
| y | 220667 | 8.9% |
| (space) | 207141 | 8.3% |
| e | 205328 | 8.3% |
| - | 191730 | 7.7% |
| s | 131711 | 5.3% |
| o | 113127 | 4.6% |
| r | 103408 | 4.2% |
| t | 93897 | 3.8% |
| Other values (68) | 684162 |
species_id
Real number (ℝ)
HIGH CORRELATION 
| Distinct | 1891 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 169870.9 |
| Minimum | 1 |
|---|---|
| Maximum | 6000002 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.7 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 4510 |
| median | 4510 |
| Q3 | 7630 |
| 95-th percentile | 1000000 |
| Maximum | 6000002 |
| Range | 6000001 |
| Interquartile range (IQR) | 3120 |
Descriptive statistics
| Standard deviation | 423699.74 |
|---|---|
| Coefficient of variation (CV) | 2.4942455 |
| Kurtosis | 25.431247 |
| Mean | 169870.9 |
| Median Absolute Deviation (MAD) | 403 |
| Skewness | 3.6687357 |
| Sum | 8.2985673 × 10^10 |
| Variance | 1.7952147 × 10^11 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4510 | 214271 | |
| 1000000 | 67507 | 13.8% |
| 4913 | 60666 | 12.4% |
| 22808 | 25338 | 5.2% |
| 5 | 14651 | 3.0% |
| 7630 | 10120 | 2.1% |
| 4 | 8709 | 1.8% |
| 1 | 8163 | 1.7% |
| 2 | 5489 | 1.1% |
| 58471 | 5006 | 1.0% |
| Other values (1881) | 68602 | 14.0% |
| Value | Count | Frequency (%) |
| 1 | 8163 | |
| 2 | 5489 | 1.1% |
| 3 | 518 | 0.1% |
| 4 | 8709 | |
| 5 | 14651 | |
| 6 | 121 | < 0.1% |
| 7 | 219 | < 0.1% |
| 8 | 1179 | 0.2% |
| 10 | 24 | < 0.1% |
| 11 | 82 | < 0.1% |
| Value | Count | Frequency (%) |
| 6000002 | 134 | |
| 6000001 | 120 | |
| 5000000 | 5 | < 0.1% |
| 4000000 | 2 | < 0.1% |
| 3000201 | 2 | < 0.1% |
| 3000200 | 1 | < 0.1% |
| 3000134 | 1 | < 0.1% |
| 3000133 | 3 | < 0.1% |
| 3000132 | 1 | < 0.1% |
| 3000131 | 1 | < 0.1% |
species_original
Text
| Distinct | 2600 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 82 |
|---|---|
| Median length | 73 |
| Mean length | 6.5615346 |
| Min length | 1 |
Characters and Unicode
| Total characters | 3205454 |
|---|---|
| Distinct characters | 51 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 742 |
|---|---|
| Unique (%) | 0.2% |
Sample
| 1st row | rat |
|---|---|
| 2nd row | rat |
| 3rd row | mouse |
| 4th row | rat |
| 5th row | rat |
| Value | Count | Frequency (%) |
| rat | 215681 | |
|  | 61402 | 10.1% |
| mouse | 61261 | 10.0% |
| rabbit | 25584 | 4.2% |
| daphnia | 16595 | 2.7% |
| magna | 14634 | 2.4% |
| dog | 10286 | 1.7% |
| oncorhynchus | 9990 | 1.6% |
| pimephales | 8155 | 1.3% |
| mykiss | 8076 | 1.3% |
| Other values (2962) | 178691 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 483142 | |
| r | 359835 | |
| t | 320106 | 10.0% |
| s | 223571 | 7.0% |
| e | 194828 | 6.1% |
| i | 189486 | 5.9% |
| o | 182841 | 5.7% |
| u | 154735 | 4.8% |
| m | 151000 | 4.7% |
| (space) | 121841 | 3.8% |
| Other values (41) | 824069 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3015164 | |
| Space Separator | 121841 | 3.8% |
| Dash Punctuation | 62133 | 1.9% |
| Other Punctuation | 4467 | 0.1% |
| Open Punctuation | 816 | < 0.1% |
| Close Punctuation | 785 | < 0.1% |
| Decimal Number | 233 | < 0.1% |
| Math Symbol | 11 | < 0.1% |
| Connector Punctuation | 4 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 483142 | |
| r | 359835 | |
| t | 320106 | |
| s | 223571 | 7.4% |
| e | 194828 | 6.5% |
| i | 189486 | 6.3% |
| o | 182841 | 6.1% |
| u | 154735 | 5.1% |
| m | 151000 | 5.0% |
| n | 112968 | 3.7% |
| Other values (16) | 642652 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 70 | |
| 2 | 37 | |
| 3 | 26 | 11.2% |
| 0 | 23 | 9.9% |
| 4 | 22 | 9.4% |
| 7 | 19 | 8.2% |
| 6 | 19 | 8.2% |
| 5 | 9 | 3.9% |
| 8 | 8 | 3.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 3306 | |
| . | 773 | 17.3% |
| : | 177 | 4.0% |
| / | 116 | 2.6% |
| ; | 58 | 1.3% |
| & | 18 | 0.4% |
| ' | 17 | 0.4% |
| ? | 2 | < 0.1% |
Math Symbol
| Value | Count | Frequency (%) |
| < | 5 | |
| = | 5 | |
| + | 1 | 9.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 121841 | |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 62133 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 816 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 785 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 4 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3015164 | |
| Common | 190290 | 5.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 483142 | |
| r | 359835 | |
| t | 320106 | |
| s | 223571 | 7.4% |
| e | 194828 | 6.5% |
| i | 189486 | 6.3% |
| o | 182841 | 6.1% |
| u | 154735 | 5.1% |
| m | 151000 | 5.0% |
| n | 112968 | 3.7% |
| Other values (16) | 642652 |
Common
| Value | Count | Frequency (%) |
| (space) | 121841 | |
| - | 62133 | |
| , | 3306 | 1.7% |
| ( | 816 | 0.4% |
| ) | 785 | 0.4% |
| . | 773 | 0.4% |
| : | 177 | 0.1% |
| / | 116 | 0.1% |
| 1 | 70 | < 0.1% |
| ; | 58 | < 0.1% |
| Other values (15) | 215 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3205454 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 483142 | |
| r | 359835 | |
| t | 320106 | 10.0% |
| s | 223571 | 7.0% |
| e | 194828 | 6.1% |
| i | 189486 | 5.9% |
| o | 182841 | 5.7% |
| u | 154735 | 4.8% |
| m | 151000 | 4.7% |
| (space) | 121841 | 3.8% |
| Other values (41) | 824069 |
strain
Text
| Distinct | 415 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 120 |
|---|---|
| Median length | 1 |
| Mean length | 5.6886261 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2779019 |
|---|---|
| Distinct characters | 77 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 117 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Sprague-Dawley |
|---|---|
| 2nd row | - |
| 3rd row | Hartley |
| 4th row | - |
| 5th row | Fischer 344 |
| Value | Count | Frequency (%) |
|  | 274806 | |
| not | 73650 | 12.2% |
| specified | 73643 | 12.2% |
| sprague-dawley | 57202 | 9.5% |
| fischer | 17770 | 3.0% |
| 344 | 17745 | 2.9% |
| new | 14814 | 2.5% |
| zealand | 14814 | 2.5% |
| crl:cd(sd | 13645 | 2.3% |
| beagle | 8403 | 1.4% |
| Other values (515) | 35412 | 5.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 334148 | 12.0% |
| e | 331999 | 11.9% |
| i | 178117 | 6.4% |
| a | 166661 | 6.0% |
| S | 148426 | 5.3% |
| p | 132498 | 4.8% |
| (space) | 113387 | 4.1% |
| r | 100760 | 3.6% |
| l | 99344 | 3.6% |
| D | 93782 | 3.4% |
| Other values (67) | 1079897 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1770776 | |
| Uppercase Letter | 447158 | 16.1% |
| Dash Punctuation | 334148 | 12.0% |
| Space Separator | 113387 | 4.1% |
| Decimal Number | 68260 | 2.5% |
| Other Punctuation | 17388 | 0.6% |
| Open Punctuation | 13951 | 0.5% |
| Close Punctuation | 13950 | 0.5% |
| Control | 1 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 331999 | |
| i | 178117 | |
| a | 166661 | 9.4% |
| p | 132498 | 7.5% |
| r | 100760 | 5.7% |
| l | 99344 | 5.6% |
| c | 92767 | 5.2% |
| d | 89616 | 5.1% |
| t | 81169 | 4.6% |
| o | 80789 | 4.6% |
| Other values (16) | 417056 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 148426 | |
| D | 93782 | |
| N | 89233 | |
| C | 39611 | 8.9% |
| F | 18679 | 4.2% |
| Z | 14816 | 3.3% |
| B | 11700 | 2.6% |
| A | 4901 | 1.1% |
| V | 4515 | 1.0% |
| W | 4362 | 1.0% |
| Other values (16) | 17133 | 3.8% |
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 35522 | |
| 3 | 18954 | |
| 1 | 8663 | 12.7% |
| 5 | 1996 | 2.9% |
| 6 | 941 | 1.4% |
| 7 | 890 | 1.3% |
| 0 | 394 | 0.6% |
| 8 | 337 | 0.5% |
| 9 | 311 | 0.5% |
| 2 | 252 | 0.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 15654 | |
| / | 1549 | 8.9% |
| ' | 94 | 0.5% |
| , | 47 | 0.3% |
| . | 21 | 0.1% |
| ? | 15 | 0.1% |
| ; | 7 | < 0.1% |
| @ | 1 | < 0.1% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 13948 | |
| [ | 3 | < 0.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 13947 | |
| ] | 3 | < 0.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 334148 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 113387 | |
Control
| Value | Count | Frequency (%) |
| (control character) | 1 | |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2217934 | |
| Common | 561085 | 20.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 331999 | |
| i | 178117 | 8.0% |
| a | 166661 | 7.5% |
| S | 148426 | 6.7% |
| p | 132498 | 6.0% |
| r | 100760 | 4.5% |
| l | 99344 | 4.5% |
| D | 93782 | 4.2% |
| c | 92767 | 4.2% |
| d | 89616 | 4.0% |
| Other values (42) | 783964 |
Common
| Value | Count | Frequency (%) |
| - | 334148 | |
| (space) | 113387 | 20.2% |
| 4 | 35522 | 6.3% |
| 3 | 18954 | 3.4% |
| : | 15654 | 2.8% |
| ( | 13948 | 2.5% |
| ) | 13947 | 2.5% |
| 1 | 8663 | 1.5% |
| 5 | 1996 | 0.4% |
| / | 1549 | 0.3% |
| Other values (15) | 3317 | 0.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2779019 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 334148 | 12.0% |
| e | 331999 | 11.9% |
| i | 178117 | 6.4% |
| a | 166661 | 6.0% |
| S | 148426 | 5.3% |
| p | 132498 | 4.8% |
| (space) | 113387 | 4.1% |
| r | 100760 | 3.6% |
| l | 99344 | 3.6% |
| D | 93782 | 3.4% |
| Other values (67) | 1079897 |
strain_original
Text
| Distinct | 3388 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 255 |
|---|---|
| Median length | 1 |
| Mean length | 5.5110149 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2692252 |
|---|---|
| Distinct characters | 89 |
| Distinct categories | 12 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1400 |
|---|---|
| Unique (%) | 0.3% |
Sample
| 1st row | Sprague-Dawley |
|---|---|
| 2nd row | - |
| 3rd row | Hartley |
| 4th row | - |
| 5th row | Fischer 344 |
| Value | Count | Frequency (%) |
| (empty) | 275273 | 45.5% |
| sprague-dawley | 48673 | 8.0% |
| wistar | 40929 | 6.8% |
| fischer | 16896 | 2.8% |
| 344 | 16394 | 2.7% |
| not | 16325 | 2.7% |
| specified | 15770 | 2.6% |
| white | 14934 | 2.5% |
| zealand | 14742 | 2.4% |
| new | 14737 | 2.4% |
| Other values (2334) | 130506 | 21.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 332710 | 12.4% |
| e | 255064 | 9.5% |
| a | 217678 | 8.1% |
| r | 150597 | 5.6% |
| i | 118286 | 4.4% |
| (space) | 117058 | 4.3% |
| D | 100917 | 3.7% |
| l | 100340 | 3.7% |
| t | 90058 | 3.3% |
| s | 88344 | 3.3% |
| Other values (79) | 1121200 | 41.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1608779 | 59.8% |
| Uppercase Letter | 450921 | 16.7% |
| Dash Punctuation | 332710 | 12.4% |
| Space Separator | 117058 | 4.3% |
| Decimal Number | 102450 | 3.8% |
| Open Punctuation | 29532 | 1.1% |
| Close Punctuation | 29449 | 1.1% |
| Other Punctuation | 21167 | 0.8% |
| Math Symbol | 166 | < 0.1% |
| Connector Punctuation | 14 | < 0.1% |
| Other values (2) | 6 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 255064 | 15.9% |
| a | 217678 | 13.5% |
| r | 150597 | 9.4% |
| i | 118286 | 7.4% |
| l | 100340 | 6.2% |
| t | 90058 | 5.6% |
| s | 88344 | 5.5% |
| p | 78987 | 4.9% |
| w | 77726 | 4.8% |
| g | 70909 | 4.4% |
| Other values (16) | 360790 | 22.4% |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 100917 | 22.4% |
| S | 80402 | 17.8% |
| W | 59169 | 13.1% |
| C | 56563 | 12.5% |
| F | 33270 | 7.4% |
| B | 25240 | 5.6% |
| N | 18156 | 4.0% |
| R | 17762 | 3.9% |
| Z | 14915 | 3.3% |
| O | 8887 | 2.0% |
| Other values (16) | 35640 | 7.9% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 13799 | 65.2% |
| / | 3951 | 18.7% |
| , | 2444 | 11.5% |
| . | 440 | 2.1% |
| ; | 255 | 1.2% |
| ? | 136 | 0.6% |
| " | 72 | 0.3% |
| ' | 22 | 0.1% |
| & | 20 | 0.1% |
| % | 11 | 0.1% |
| Other values (3) | 17 | 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 36185 | 35.3% |
| 3 | 30569 | 29.8% |
| 1 | 18305 | 17.9% |
| 6 | 11641 | 11.4% |
| 5 | 2183 | 2.1% |
| 7 | 1058 | 1.0% |
| 0 | 990 | 1.0% |
| 2 | 619 | 0.6% |
| 8 | 499 | 0.5% |
| 9 | 401 | 0.4% |
Math Symbol
| Value | Count | Frequency (%) |
| = | 129 | 77.7% |
| + | 30 | 18.1% |
| ~ | 6 | 3.6% |
| > | 1 | 0.6% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 23250 | 78.7% |
| [ | 6282 | 21.3% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 23162 | 78.7% |
| ] | 6287 | 21.3% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 4 | 80.0% |
| ^ | 1 | 20.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 332710 | 100.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 117058 | 100.0% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 14 | 100.0% |
Control
| Value | Count | Frequency (%) |
| (control character) | 1 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2059700 | 76.5% |
| Common | 632552 | 23.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 255064 | 12.4% |
| a | 217678 | 10.6% |
| r | 150597 | 7.3% |
| i | 118286 | 5.7% |
| D | 100917 | 4.9% |
| l | 100340 | 4.9% |
| t | 90058 | 4.4% |
| s | 88344 | 4.3% |
| S | 80402 | 3.9% |
| p | 78987 | 3.8% |
| Other values (42) | 779027 | 37.8% |
Common
| Value | Count | Frequency (%) |
| - | 332710 | 52.6% |
| (space) | 117058 | 18.5% |
| 4 | 36185 | 5.7% |
| 3 | 30569 | 4.8% |
| ( | 23250 | 3.7% |
| ) | 23162 | 3.7% |
| 1 | 18305 | 2.9% |
| : | 13799 | 2.2% |
| 6 | 11641 | 1.8% |
| ] | 6287 | 1.0% |
| Other values (27) | 19586 | 3.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2692252 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 332710 | 12.4% |
| e | 255064 | 9.5% |
| a | 217678 | 8.1% |
| r | 150597 | 5.6% |
| i | 118286 | 4.4% |
| (space) | 117058 | 4.3% |
| D | 100917 | 3.7% |
| l | 100340 | 3.7% |
| t | 90058 | 3.3% |
| s | 88344 | 3.3% |
| Other values (79) | 1121200 | 41.6% |
strain_group
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 45 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - | 275139 |
|---|---|
| Sprague-Dawley | 72057 |
| Cat | 56100 |
| Fischer | 17770 |
| Dog | 17104 |
| Other values (40) | 50352 |
Length
| Max length | 14 |
|---|---|
| Median length | 1 |
| Mean length | 4.3220039 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2111394 |
|---|---|
| Distinct characters | 51 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Sprague-Dawley |
|---|---|
| 2nd row | - |
| 3rd row | Guinea Pig |
| 4th row | - |
| 5th row | Fischer |
Common Values
| Value | Count | Frequency (%) |
| - | 275139 | 56.3% |
| Sprague-Dawley | 72057 | 14.8% |
| Cat | 56100 | 11.5% |
| Fischer | 17770 | 3.6% |
| Dog | 17104 | 3.5% |
| New Zealand | 14806 | 3.0% |
| Mouse Other | 11441 | 2.3% |
| Beagle | 8402 | 1.7% |
| Not Specified | 4237 | 0.9% |
| Wistar | 3191 | 0.7% |
| Other values (35) | 8275 | 1.7% |
| Value | Count | Frequency (%) |
| (empty) | 275139 | 52.7% |
| sprague-dawley | 72057 | 13.8% |
| cat | 56100 | 10.7% |
| fischer | 17770 | 3.4% |
| dog | 17104 | 3.3% |
| new | 14806 | 2.8% |
| zealand | 14806 | 2.8% |
| other | 13000 | 2.5% |
| mouse | 11454 | 2.2% |
| beagle | 8402 | 1.6% |
| Other values (39) | 21354 | 4.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 347918 | 16.5% |
| a | 249017 | 11.8% |
| e | 245032 | 11.6% |
| r | 109077 | 5.2% |
| g | 99563 | 4.7% |
| l | 95530 | 4.5% |
| D | 89173 | 4.2% |
| w | 86986 | 4.1% |
| u | 84769 | 4.0% |
| t | 80613 | 3.8% |
| Other values (41) | 623716 | 29.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1401721 | 66.4% |
| Dash Punctuation | 347918 | 16.5% |
| Uppercase Letter | 323296 | 15.3% |
| Space Separator | 33470 | 1.6% |
| Decimal Number | 3698 | 0.2% |
| Other Punctuation | 1291 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 249017 | 17.8% |
| e | 245032 | 17.5% |
| r | 109077 | 7.8% |
| g | 99563 | 7.1% |
| l | 95530 | 6.8% |
| w | 86986 | 6.2% |
| u | 84769 | 6.0% |
| t | 80613 | 5.8% |
| p | 76379 | 5.4% |
| y | 72547 | 5.2% |
| Other values (12) | 202208 | 14.4% |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 89173 | 27.6% |
| S | 76471 | 23.7% |
| C | 58791 | 18.2% |
| N | 19043 | 5.9% |
| F | 17982 | 5.6% |
| Z | 14806 | 4.6% |
| O | 12985 | 4.0% |
| M | 11711 | 3.6% |
| B | 11062 | 3.4% |
| W | 3235 | 1.0% |
| Other values (11) | 8037 | 2.5% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 874 | 23.6% |
| 6 | 874 | 23.6% |
| 7 | 739 | 20.0% |
| 5 | 739 | 20.0% |
| 3 | 472 | 12.8% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 347918 | 100.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 33470 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 1291 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1725017 | 81.7% |
| Common | 386377 | 18.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 249017 | 14.4% |
| e | 245032 | 14.2% |
| r | 109077 | 6.3% |
| g | 99563 | 5.8% |
| l | 95530 | 5.5% |
| D | 89173 | 5.2% |
| w | 86986 | 5.0% |
| u | 84769 | 4.9% |
| t | 80613 | 4.7% |
| S | 76471 | 4.4% |
| Other values (33) | 508786 | 29.5% |
Common
| Value | Count | Frequency (%) |
| - | 347918 | 90.0% |
| (space) | 33470 | 8.7% |
| / | 1291 | 0.3% |
| 1 | 874 | 0.2% |
| 6 | 874 | 0.2% |
| 7 | 739 | 0.2% |
| 5 | 739 | 0.2% |
| 3 | 472 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2111394 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 347918 | 16.5% |
| a | 249017 | 11.8% |
| e | 245032 | 11.6% |
| r | 109077 | 5.2% |
| g | 99563 | 4.7% |
| l | 95530 | 4.5% |
| D | 89173 | 4.2% |
| w | 86986 | 4.1% |
| u | 84769 | 4.0% |
| t | 80613 | 3.8% |
| Other values (41) | 623716 | 29.5% |
habitat
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - | 488425 |
|---|---|
| Terrestrial | 97 |
Length
| Max length | 11 |
|---|---|
| Median length | 1 |
| Mean length | 1.0019856 |
| Min length | 1 |
Characters and Unicode
| Total characters | 489492 |
|---|---|
| Distinct characters | 9 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
Common Values
| Value | Count | Frequency (%) |
| - | 488425 | > 99.9% |
| Terrestrial | 97 | < 0.1% |
| Value | Count | Frequency (%) |
| (empty) | 488425 | > 99.9% |
| terrestrial | 97 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 488425 | 99.8% |
| r | 291 | 0.1% |
| e | 194 | < 0.1% |
| T | 97 | < 0.1% |
| s | 97 | < 0.1% |
| t | 97 | < 0.1% |
| i | 97 | < 0.1% |
| a | 97 | < 0.1% |
| l | 97 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 488425 | 99.8% |
| Lowercase Letter | 970 | 0.2% |
| Uppercase Letter | 97 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 291 | 30.0% |
| e | 194 | 20.0% |
| s | 97 | 10.0% |
| t | 97 | 10.0% |
| i | 97 | 10.0% |
| a | 97 | 10.0% |
| l | 97 | 10.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 488425 | 100.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 97 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 488425 | 99.8% |
| Latin | 1067 | 0.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 291 | 27.3% |
| e | 194 | 18.2% |
| T | 97 | 9.1% |
| s | 97 | 9.1% |
| t | 97 | 9.1% |
| i | 97 | 9.1% |
| a | 97 | 9.1% |
| l | 97 | 9.1% |
Common
| Value | Count | Frequency (%) |
| - | 488425 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 489492 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 488425 | 99.8% |
| r | 291 | 0.1% |
| e | 194 | < 0.1% |
| T | 97 | < 0.1% |
| s | 97 | < 0.1% |
| t | 97 | < 0.1% |
| i | 97 | < 0.1% |
| a | 97 | < 0.1% |
| l | 97 | < 0.1% |
sex
Categorical
HIGH CORRELATION 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - | 275598 |
|---|---|
| M/F | 90642 |
| F | 66549 |
| M | 55705 |
| unknown | 28 |
Length
| Max length | 7 |
|---|---|
| Median length | 1 |
| Mean length | 1.3714306 |
| Min length | 1 |
Characters and Unicode
| Total characters | 669974 |
|---|---|
| Distinct characters | 9 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | - |
|---|---|
| 2nd row | M/F |
| 3rd row | M |
| 4th row | M/F |
| 5th row | M/F |
Common Values
| Value | Count | Frequency (%) |
| - | 275598 | 56.4% |
| M/F | 90642 | 18.6% |
| F | 66549 | 13.6% |
| M | 55705 | 11.4% |
| unknown | 28 | < 0.1% |
| Value | Count | Frequency (%) |
| (empty) | 275598 | 56.4% |
| m/f | 90642 | 18.6% |
| f | 66549 | 13.6% |
| m | 55705 | 11.4% |
| unknown | 28 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 275598 | 41.1% |
| F | 157191 | 23.5% |
| M | 146347 | 21.8% |
| / | 90642 | 13.5% |
| n | 84 | < 0.1% |
| u | 28 | < 0.1% |
| k | 28 | < 0.1% |
| o | 28 | < 0.1% |
| w | 28 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 303538 | 45.3% |
| Dash Punctuation | 275598 | 41.1% |
| Other Punctuation | 90642 | 13.5% |
| Lowercase Letter | 196 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 84 | 42.9% |
| u | 28 | 14.3% |
| k | 28 | 14.3% |
| o | 28 | 14.3% |
| w | 28 | 14.3% |
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 157191 | 51.8% |
| M | 146347 | 48.2% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 275598 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 90642 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 366240 | 54.7% |
| Latin | 303734 | 45.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| F | 157191 | 51.8% |
| M | 146347 | 48.2% |
| n | 84 | < 0.1% |
| u | 28 | < 0.1% |
| k | 28 | < 0.1% |
| o | 28 | < 0.1% |
| w | 28 | < 0.1% |
Common
| Value | Count | Frequency (%) |
| - | 275598 | 75.3% |
| / | 90642 | 24.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 669974 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 275598 | 41.1% |
| F | 157191 | 23.5% |
| M | 146347 | 21.8% |
| / | 90642 | 13.5% |
| n | 84 | < 0.1% |
| u | 28 | < 0.1% |
| k | 28 | < 0.1% |
| o | 28 | < 0.1% |
| w | 28 | < 0.1% |
sex_original
Text
| Distinct | 145 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 36 |
|---|---|
| Median length | 1 |
| Mean length | 3.6818444 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1798662 |
|---|---|
| Distinct characters | 44 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 20 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | - |
|---|---|
| 2nd row | male/female |
| 3rd row | male |
| 4th row | male/female |
| 5th row | male/female |
| Value | Count | Frequency (%) |
| (empty) | 263154 | 52.2% |
| male/female | 85217 | 16.9% |
| female | 37639 | 7.5% |
| male | 31642 | 6.3% |
| f | 29891 | 5.9% |
| m | 25015 | 5.0% |
| not | 12442 | 2.5% |
| specified | 12386 | 2.5% |
| mf | 2826 | 0.6% |
| female,male | 910 | 0.2% |
| Other values (97) | 2898 | 0.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 390662 | 21.7% |
| - | 263402 | 14.6% |
| a | 242466 | 13.5% |
| l | 241586 | 13.4% |
| m | 236839 | 13.2% |
| f | 131486 | 7.3% |
| / | 85366 | 4.7% |
| F | 37812 | 2.1% |
| M | 33392 | 1.9% |
| i | 25180 | 1.4% |
| Other values (34) | 110471 | 6.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1359097 | 75.6% |
| Dash Punctuation | 263402 | 14.6% |
| Other Punctuation | 86675 | 4.8% |
| Uppercase Letter | 71724 | 4.0% |
| Space Separator | 15498 | 0.9% |
| Decimal Number | 2186 | 0.1% |
| Close Punctuation | 40 | < 0.1% |
| Open Punctuation | 40 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 390662 | 28.7% |
| a | 242466 | 17.8% |
| l | 241586 | 17.8% |
| m | 236839 | 17.4% |
| f | 131486 | 9.7% |
| i | 25180 | 1.9% |
| d | 13705 | 1.0% |
| n | 13678 | 1.0% |
| o | 12965 | 1.0% |
| t | 12558 | 0.9% |
| Other values (10) | 37972 | 2.8% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 481 | 22.0% |
| 0 | 477 | 21.8% |
| 5 | 304 | 13.9% |
| 6 | 228 | 10.4% |
| 2 | 216 | 9.9% |
| 4 | 159 | 7.3% |
| 3 | 132 | 6.0% |
| 8 | 114 | 5.2% |
| 7 | 64 | 2.9% |
| 9 | 11 | 0.5% |
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 37812 | 52.7% |
| M | 33392 | 46.6% |
| C | 386 | 0.5% |
| N | 79 | 0.1% |
| R | 31 | < 0.1% |
| S | 22 | < 0.1% |
| U | 2 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 85366 | 98.5% |
| , | 1265 | 1.5% |
| ; | 44 | 0.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 263402 | 100.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 15498 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 40 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 40 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1430821 | 79.5% |
| Common | 367841 | 20.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 390662 | 27.3% |
| a | 242466 | 16.9% |
| l | 241586 | 16.9% |
| m | 236839 | 16.6% |
| f | 131486 | 9.2% |
| F | 37812 | 2.6% |
| M | 33392 | 2.3% |
| i | 25180 | 1.8% |
| d | 13705 | 1.0% |
| n | 13678 | 1.0% |
| Other values (17) | 64015 | 4.5% |
Common
| Value | Count | Frequency (%) |
| - | 263402 | 71.6% |
| / | 85366 | 23.2% |
| (space) | 15498 | 4.2% |
| , | 1265 | 0.3% |
| 1 | 481 | 0.1% |
| 0 | 477 | 0.1% |
| 5 | 304 | 0.1% |
| 6 | 228 | 0.1% |
| 2 | 216 | 0.1% |
| 4 | 159 | < 0.1% |
| Other values (7) | 445 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1798662 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 390662 | 21.7% |
| - | 263402 | 14.6% |
| a | 242466 | 13.5% |
| l | 241586 | 13.4% |
| m | 236839 | 13.2% |
| f | 131486 | 7.3% |
| / | 85366 | 4.7% |
| F | 37812 | 2.1% |
| M | 33392 | 1.9% |
| i | 25180 | 1.4% |
| Other values (34) | 110471 | 6.1% |
critical_effect
Text
| Distinct | 23018 |
|---|---|
| Distinct (%) | 4.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 6094 |
|---|---|
| Median length | 1 |
| Mean length | 32.462522 |
| Min length | 1 |
Characters and Unicode
| Total characters | 15858656 |
|---|---|
| Distinct characters | 92 |
| Distinct categories | 12 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 9708 |
|---|---|
| Unique (%) | 2.0% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | body weight and weight gain |
| Value | Count | Frequency (%) |
| (empty) | 283379 | 19.7% |
| life | 67488 | 4.7% |
| mortality | 59018 | 4.1% |
| weight | 37693 | 2.6% |
| test | 36502 | 2.5% |
| mat | 36346 | 2.5% |
| observation-body | 30234 | 2.1% |
| to | 27192 | 1.9% |
| body | 26845 | 1.9% |
| weight/body | 24992 | 1.7% |
| Other values (17761) | 811205 | 56.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 1330701 | 8.4% |
| i | 1299587 | 8.2% |
| e | 1176209 | 7.4% |
| t | 1174115 | 7.4% |
| a | 1001831 | 6.3% |
| (space) | 955497 | 6.0% |
| l | 830039 | 5.2% |
| r | 804629 | 5.1% |
| n | 787787 | 5.0% |
| s | 663041 | 4.2% |
| Other values (82) | 5835220 | 36.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 13292463 | 83.8% |
| Space Separator | 955497 | 6.0% |
| Dash Punctuation | 546581 | 3.4% |
| Other Punctuation | 419472 | 2.6% |
| Uppercase Letter | 237404 | 1.5% |
| Math Symbol | 230712 | 1.5% |
| Open Punctuation | 80553 | 0.5% |
| Close Punctuation | 80549 | 0.5% |
| Decimal Number | 10691 | 0.1% |
| Connector Punctuation | 4701 | < 0.1% |
| Other values (2) | 33 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 1330701 | 10.0% |
| i | 1299587 | 9.8% |
| e | 1176209 | 8.8% |
| t | 1174115 | 8.8% |
| a | 1001831 | 7.5% |
| l | 830039 | 6.2% |
| r | 804629 | 6.1% |
| n | 787787 | 5.9% |
| s | 663041 | 5.0% |
| c | 618284 | 4.7% |
| Other values (16) | 3606240 | 27.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| I | 17532 | 7.4% |
| P | 17483 | 7.4% |
| G | 16759 | 7.1% |
| A | 16737 | 7.1% |
| M | 15798 | 6.7% |
| T | 13490 | 5.7% |
| H | 13387 | 5.6% |
| R | 13269 | 5.6% |
| B | 13171 | 5.5% |
| L | 12753 | 5.4% |
| Other values (16) | 87025 | 36.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 281112 | 67.0% |
| : | 55608 | 13.3% |
| . | 45277 | 10.8% |
| , | 29001 | 6.9% |
| ; | 6116 | 1.5% |
| " | 1072 | 0.3% |
| % | 793 | 0.2% |
| ? | 244 | 0.1% |
| ' | 221 | 0.1% |
| * | 17 | < 0.1% |
| Other values (4) | 11 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 2232 | 20.9% |
| 0 | 1777 | 16.6% |
| 1 | 1761 | 16.5% |
| 3 | 1194 | 11.2% |
| 2 | 1161 | 10.9% |
| 5 | 1152 | 10.8% |
| 8 | 566 | 5.3% |
| 6 | 400 | 3.7% |
| 9 | 227 | 2.1% |
| 7 | 221 | 2.1% |
Math Symbol
| Value | Count | Frequency (%) |
| \| | 226066 | 98.0% |
| > | 1910 | 0.8% |
| < | 1668 | 0.7% |
| + | 744 | 0.3% |
| = | 250 | 0.1% |
| ~ | 74 | < 0.1% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 73981 | 91.8% |
| [ | 6572 | 8.2% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 73977 | 91.8% |
| ] | 6572 | 8.2% |
Control
| Value | Count | Frequency (%) |
| (control character) | 16 | 50.0% |
| (control character) | 16 | 50.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 955497 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 546581 | 100.0% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 4701 | 100.0% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ^ | 1 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 13529867 | 85.3% |
| Common | 2328789 | 14.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 1330701 | 9.8% |
| i | 1299587 | 9.6% |
| e | 1176209 | 8.7% |
| t | 1174115 | 8.7% |
| a | 1001831 | 7.4% |
| l | 830039 | 6.1% |
| r | 804629 | 5.9% |
| n | 787787 | 5.8% |
| s | 663041 | 4.9% |
| c | 618284 | 4.6% |
| Other values (42) | 3843644 | 28.4% |
Common
| Value | Count | Frequency (%) |
| (space) | 955497 | 41.0% |
| - | 546581 | 23.5% |
| / | 281112 | 12.1% |
| \| | 226066 | 9.7% |
| ( | 73981 | 3.2% |
| ) | 73977 | 3.2% |
| : | 55608 | 2.4% |
| . | 45277 | 1.9% |
| , | 29001 | 1.2% |
| ] | 6572 | 0.3% |
| Other values (30) | 35117 | 1.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 15858656 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 1330701 | 8.4% |
| i | 1299587 | 8.2% |
| e | 1176209 | 7.4% |
| t | 1174115 | 7.4% |
| a | 1001831 | 6.3% |
| (space) | 955497 | 6.0% |
| l | 830039 | 5.2% |
| r | 804629 | 5.1% |
| n | 787787 | 5.0% |
| s | 663041 | 4.2% |
| Other values (82) | 5835220 | 36.8% |
| Distinct | 23100 |
|---|---|
| Distinct (%) | 4.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 6094 |
|---|---|
| Median length | 6013 |
| Mean length | 33.280988 |
| Min length | 1 |
Characters and Unicode
| Total characters | 16258495 |
|---|---|
| Distinct characters | 92 |
| Distinct categories | 12 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 9766 |
|---|---|
| Unique (%) | 2.0% |
Sample
| 1st row | - |
|---|---|
| 2nd row | other: |
| 3rd row | other: |
| 4th row | other: |
| 5th row | body weight and weight gain |
| Value | Count | Frequency (%) |
| (empty) | 232969 | 16.0% |
| life | 67488 | 4.6% |
| mortality | 59014 | 4.1% |
| other | 39113 | 2.7% |
| weight | 36714 | 2.5% |
| test | 36502 | 2.5% |
| mat | 36346 | 2.5% |
| observation-body | 30234 | 2.1% |
| to | 27194 | 1.9% |
| body | 26845 | 1.8% |
| Other values (17688) | 863143 | 59.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 1368084 | 8.4% |
| i | 1286036 | 7.9% |
| t | 1206566 | 7.4% |
| e | 1201182 | 7.4% |
| a | 985089 | 6.1% |
| (space) | 970207 | 6.0% |
| r | 834785 | 5.1% |
| l | 820956 | 5.0% |
| n | 806615 | 5.0% |
| s | 654628 | 4.0% |
| Other values (82) | 6124347 | 37.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 13362274 | 82.2% |
| Space Separator | 970207 | 6.0% |
| Uppercase Letter | 550662 | 3.4% |
| Dash Punctuation | 496979 | 3.1% |
| Other Punctuation | 457712 | 2.8% |
| Math Symbol | 242740 | 1.5% |
| Close Punctuation | 81250 | 0.5% |
| Open Punctuation | 81190 | 0.5% |
| Decimal Number | 10747 | 0.1% |
| Connector Punctuation | 4701 | < 0.1% |
| Other values (2) | 33 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 1368084 | 10.2% |
| i | 1286036 | 9.6% |
| t | 1206566 | 9.0% |
| e | 1201182 | 9.0% |
| a | 985089 | 7.4% |
| r | 834785 | 6.2% |
| l | 820956 | 6.1% |
| n | 806615 | 6.0% |
| s | 654628 | 4.9% |
| c | 609464 | 4.6% |
| Other values (16) | 3588869 | 26.9% |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 67950 | 12.3% |
| E | 41586 | 7.6% |
| A | 41147 | 7.5% |
| I | 37208 | 6.8% |
| R | 37003 | 6.7% |
| S | 32151 | 5.8% |
| N | 29874 | 5.4% |
| O | 28922 | 5.3% |
| T | 28068 | 5.1% |
| L | 26450 | 4.8% |
| Other values (16) | 180303 | 32.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 282431 | 61.7% |
| : | 92492 | 20.2% |
| . | 45291 | 9.9% |
| , | 29003 | 6.3% |
| ; | 6136 | 1.3% |
| " | 1072 | 0.2% |
| % | 793 | 0.2% |
| ? | 244 | 0.1% |
| ' | 222 | < 0.1% |
| * | 17 | < 0.1% |
| Other values (4) | 11 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 2249 | 20.9% |
| 0 | 1786 | 16.6% |
| 1 | 1767 | 16.4% |
| 3 | 1203 | 11.2% |
| 2 | 1162 | 10.8% |
| 5 | 1156 | 10.8% |
| 8 | 566 | 5.3% |
| 6 | 403 | 3.7% |
| 9 | 233 | 2.2% |
| 7 | 222 | 2.1% |
Math Symbol
| Value | Count | Frequency (%) |
| \| | 222030 | 91.5% |
| > | 9940 | 4.1% |
| < | 9698 | 4.0% |
| + | 744 | 0.3% |
| = | 254 | 0.1% |
| ~ | 74 | < 0.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 74678 | 91.9% |
| ] | 6572 | 8.1% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 74618 | 91.9% |
| [ | 6572 | 8.1% |
Control
| Value | Count | Frequency (%) |
| (control character) | 16 | 50.0% |
| (control character) | 16 | 50.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 970207 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 496979 | 100.0% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 4701 | 100.0% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ^ | 1 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 13912936 | 85.6% |
| Common | 2345559 | 14.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 1368084 | 9.8% |
| i | 1286036 | 9.2% |
| t | 1206566 | 8.7% |
| e | 1201182 | 8.6% |
| a | 985089 | 7.1% |
| r | 834785 | 6.0% |
| l | 820956 | 5.9% |
| n | 806615 | 5.8% |
| s | 654628 | 4.7% |
| c | 609464 | 4.4% |
| Other values (42) | 4139531 | 29.8% |
Common
| Value | Count | Frequency (%) |
| (space) | 970207 | 41.4% |
| - | 496979 | 21.2% |
| / | 282431 | 12.0% |
| \| | 222030 | 9.5% |
| : | 92492 | 3.9% |
| ) | 74678 | 3.2% |
| ( | 74618 | 3.2% |
| . | 45291 | 1.9% |
| , | 29003 | 1.2% |
| > | 9940 | 0.4% |
| Other values (30) | 47890 | 2.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 16258495 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| o | 1368084 | 8.4% |
| i | 1286036 | 7.9% |
| t | 1206566 | 7.4% |
| e | 1201182 | 7.4% |
| a | 985089 | 6.1% |
| (space) | 970207 | 6.0% |
| r | 834785 | 5.1% |
| l | 820956 | 5.0% |
| n | 806615 | 5.0% |
| s | 654628 | 4.0% |
| Other values (82) | 6124347 | 37.7% |
population
Text
| Distinct | 290 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 51 |
|---|---|
| Median length | 1 |
| Mean length | 1.1734804 |
| Min length | 1 |
Characters and Unicode
| Total characters | 573271 |
|---|---|
| Distinct characters | 69 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 58 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
| Value | Count | Frequency (%) |
| (empty) | 482970 | 97.1% |
| female | 2226 | 0.4% |
| male | 1879 | 0.4% |
| rats | 964 | 0.2% |
| rat | 931 | 0.2% |
| f1 | 773 | 0.2% |
| maternal | 685 | 0.1% |
| mice | 565 | 0.1% |
| offspring | 550 | 0.1% |
| p0 | 499 | 0.1% |
| Other values (128) | 5318 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 483556 | 84.4% |
| a | 10982 | 1.9% |
| e | 10128 | 1.8% |
| (space) | 8838 | 1.5% |
| l | 7238 | 1.3% |
| r | 4086 | 0.7% |
| t | 4051 | 0.7% |
| F | 3962 | 0.7% |
| m | 2782 | 0.5% |
| M | 2770 | 0.5% |
| Other values (59) | 34878 | 6.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 483556 | 84.4% |
| Lowercase Letter | 52927 | 9.2% |
| Uppercase Letter | 18707 | 3.3% |
| Space Separator | 8838 | 1.5% |
| Decimal Number | 2807 | 0.5% |
| Open Punctuation | 2624 | 0.5% |
| Close Punctuation | 2624 | 0.5% |
| Other Punctuation | 1182 | 0.2% |
| Connector Punctuation | 6 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 10982 | 20.7% |
| e | 10128 | 19.1% |
| l | 7238 | 13.7% |
| r | 4086 | 7.7% |
| t | 4051 | 7.7% |
| m | 2782 | 5.3% |
| n | 2202 | 4.2% |
| s | 2143 | 4.0% |
| i | 1888 | 3.6% |
| p | 1240 | 2.3% |
| Other values (16) | 6187 | 11.7% |
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 3962 | 21.2% |
| M | 2770 | 14.8% |
| C | 2061 | 11.0% |
| D | 2033 | 10.9% |
| R | 1970 | 10.5% |
| S | 1418 | 7.6% |
| P | 1065 | 5.7% |
| B | 746 | 4.0% |
| W | 632 | 3.4% |
| O | 600 | 3.2% |
| Other values (13) | 1450 | 7.8% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 976 | 34.8% |
| 0 | 899 | 32.0% |
| 6 | 290 | 10.3% |
| 5 | 233 | 8.3% |
| 7 | 229 | 8.2% |
| 2 | 83 | 3.0% |
| 3 | 68 | 2.4% |
| 9 | 20 | 0.7% |
| 4 | 9 | 0.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 816 | 69.0% |
| / | 351 | 29.7% |
| . | 12 | 1.0% |
| ? | 3 | 0.3% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 2621 | 99.9% |
| [ | 3 | 0.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 2621 | 99.9% |
| ] | 3 | 0.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 483556 | 100.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 8838 | 100.0% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 6 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 501637 | 87.5% |
| Latin | 71634 | 12.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 10982 | 15.3% |
| e | 10128 | 14.1% |
| l | 7238 | 10.1% |
| r | 4086 | 5.7% |
| t | 4051 | 5.7% |
| F | 3962 | 5.5% |
| m | 2782 | 3.9% |
| M | 2770 | 3.9% |
| n | 2202 | 3.1% |
| s | 2143 | 3.0% |
| Other values (39) | 21290 | 29.7% |
Common
| Value | Count | Frequency (%) |
| - | 483556 | 96.4% |
| (space) | 8838 | 1.8% |
| ( | 2621 | 0.5% |
| ) | 2621 | 0.5% |
| 1 | 976 | 0.2% |
| 0 | 899 | 0.2% |
| : | 816 | 0.2% |
| / | 351 | 0.1% |
| 6 | 290 | 0.1% |
| 5 | 233 | < 0.1% |
| Other values (10) | 436 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 573271 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 483556 | 84.4% |
| a | 10982 | 1.9% |
| e | 10128 | 1.8% |
| (space) | 8838 | 1.5% |
| l | 7238 | 1.3% |
| r | 4086 | 0.7% |
| t | 4051 | 0.7% |
| F | 3962 | 0.7% |
| m | 2782 | 0.5% |
| M | 2770 | 0.5% |
| Other values (59) | 34878 | 6.1% |
| Distinct | 290 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 51 |
|---|---|
| Median length | 1 |
| Mean length | 1.1734804 |
| Min length | 1 |
Characters and Unicode
| Total characters | 573271 |
|---|---|
| Distinct characters | 69 |
| Distinct categories | 9 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 58 ? |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
| Value | Count | Frequency (%) |
| 482970 | ||
| female | 2226 | 0.4% |
| male | 1879 | 0.4% |
| rats | 964 | 0.2% |
| rat | 931 | 0.2% |
| f1 | 773 | 0.2% |
| maternal | 685 | 0.1% |
| mice | 565 | 0.1% |
| offspring | 550 | 0.1% |
| p0 | 499 | 0.1% |
| Other values (128) | 5318 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 483556 | |
| a | 10982 | 1.9% |
| e | 10128 | 1.8% |
| 8838 | 1.5% | |
| l | 7238 | 1.3% |
| r | 4086 | 0.7% |
| t | 4051 | 0.7% |
| F | 3962 | 0.7% |
| m | 2782 | 0.5% |
| M | 2770 | 0.5% |
| Other values (59) | 34878 | 6.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 483556 | |
| Lowercase Letter | 52927 | 9.2% |
| Uppercase Letter | 18707 | 3.3% |
| Space Separator | 8838 | 1.5% |
| Decimal Number | 2807 | 0.5% |
| Open Punctuation | 2624 | 0.5% |
| Close Punctuation | 2624 | 0.5% |
| Other Punctuation | 1182 | 0.2% |
| Connector Punctuation | 6 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 10982 | |
| e | 10128 | |
| l | 7238 | |
| r | 4086 | 7.7% |
| t | 4051 | 7.7% |
| m | 2782 | 5.3% |
| n | 2202 | 4.2% |
| s | 2143 | 4.0% |
| i | 1888 | 3.6% |
| p | 1240 | 2.3% |
| Other values (16) | 6187 |
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 3962 | |
| M | 2770 | |
| C | 2061 | |
| D | 2033 | |
| R | 1970 | |
| S | 1418 | 7.6% |
| P | 1065 | 5.7% |
| B | 746 | 4.0% |
| W | 632 | 3.4% |
| O | 600 | 3.2% |
| Other values (13) | 1450 | 7.8% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 976 | |
| 0 | 899 | |
| 6 | 290 | 10.3% |
| 5 | 233 | 8.3% |
| 7 | 229 | 8.2% |
| 2 | 83 | 3.0% |
| 3 | 68 | 2.4% |
| 9 | 20 | 0.7% |
| 4 | 9 | 0.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 816 | |
| / | 351 | |
| . | 12 | 1.0% |
| ? | 3 | 0.3% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 2621 | |
| [ | 3 | 0.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 2621 | |
| ] | 3 | 0.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 483556 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 8838 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 6 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 501637 | |
| Latin | 71634 | 12.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 10982 | |
| e | 10128 | |
| l | 7238 | 10.1% |
| r | 4086 | 5.7% |
| t | 4051 | 5.7% |
| F | 3962 | 5.5% |
| m | 2782 | 3.9% |
| M | 2770 | 3.9% |
| n | 2202 | 3.1% |
| s | 2143 | 3.0% |
| Other values (39) | 21290 |
Common
| Value | Count | Frequency (%) |
| - | 483556 | |
| (space) | 8838 | 1.8% |
| ( | 2621 | 0.5% |
| ) | 2621 | 0.5% |
| 1 | 976 | 0.2% |
| 0 | 899 | 0.2% |
| : | 816 | 0.2% |
| / | 351 | 0.1% |
| 6 | 290 | 0.1% |
| 5 | 233 | < 0.1% |
| Other values (10) | 436 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 573271 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 483556 | |
| a | 10982 | 1.9% |
| e | 10128 | 1.8% |
| (space) | 8838 | 1.5% |
| l | 7238 | 1.3% |
| r | 4086 | 0.7% |
| t | 4051 | 0.7% |
| F | 3962 | 0.7% |
| m | 2782 | 0.5% |
| M | 2770 | 0.5% |
| Other values (59) | 34878 | 6.1% |
exposure_route
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 31 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| oral | 233146 |
|---|---|
| - | 145752 |
| inhalation | 69404 |
| dermal | 33902 |
| oral, dermal, inhalation | 2937 |
| Other values (26) | 3381 |
Length
| Max length | 29 |
|---|---|
| Median length | 28 |
| Mean length | 4.2739283 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2087908 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 5 |
|---|---|
| Unique (%) | < 0.1% |
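The "Mean length" figure above appears to be the total character count divided by the number of observations. The following is a minimal sketch of that check, not the profiler's own code; the function name `mean_length` is hypothetical, and the inputs are copied from this report (2087908 total characters for exposure_route, 488522 observations).

```python
def mean_length(total_characters: int, observations: int) -> float:
    """Average string length per row, computed from report totals."""
    if observations <= 0:
        raise ValueError("observations must be positive")
    return total_characters / observations


# Figures copied from this report (assumed to describe the same column).
exposure_route_mean = mean_length(2_087_908, 488_522)
print(f"exposure_route mean length: {exposure_route_mean:.7f}")
```

The same relation can be used to sanity-check the other variables, for example 489156 characters over 488522 rows for exposure_form.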
Sample
| 1st row | oral |
|---|---|
| 2nd row | oral |
| 3rd row | oral |
| 4th row | oral |
| 5th row | oral |
Common Values
| Value | Count | Frequency (%) |
| oral | 233146 | |
| - | 145752 | |
| inhalation | 69404 | 14.2% |
| dermal | 33902 | 6.9% |
| oral, dermal, inhalation | 2937 | 0.6% |
| oral, inhalation | 1478 | 0.3% |
| subcutaneous | 755 | 0.2% |
| soil | 611 | 0.1% |
| injection | 208 | < 0.1% |
| intraperitoneal | 127 | < 0.1% |
| Other values (21) | 202 | < 0.1% |
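Where the frequency cell is blank in the table above (the report drew the percentage inside a bar for the largest values), it can be recomputed from the count and the 488522 observations given in the dataset statistics. A small sketch, assuming frequencies are rounded to one decimal place; `frequency_pct` is a hypothetical helper name.

```python
TOTAL_OBSERVATIONS = 488_522  # "Number of observations" in the dataset statistics


def frequency_pct(count: int, total: int = TOTAL_OBSERVATIONS) -> float:
    """Share of rows holding a value, as a percentage rounded to 0.1."""
    return round(100 * count / total, 1)


# Counts copied from the exposure_route "Common Values" table.
counts = {"oral": 233_146, "-": 145_752, "inhalation": 69_404, "dermal": 33_902}
for value, count in counts.items():
    print(f"{value}: {frequency_pct(count)}%")
```

The recomputed values for inhalation and dermal agree with the 14.2% and 6.9% printed in the table, which suggests the same formula applies to the rows with blank cells.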
| Value | Count | Frequency (%) |
| oral | 237562 | |
| (empty) | 145752 | |
| inhalation | 73897 | 14.9% |
| dermal | 36917 | 7.4% |
| subcutaneous | 757 | 0.2% |
| soil | 611 | 0.1% |
| injection | 212 | < 0.1% |
| intraperitoneal | 130 | < 0.1% |
| intravenous | 57 | < 0.1% |
| parental | 11 | < 0.1% |
| Other values (19) | 63 | < 0.1% |
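The word counts above are higher than the matching category counts because multi-route values such as "oral, dermal, inhalation" contribute one token per route. The sketch below shows one way such a table could be derived. It assumes tokens are lowercased, split on whitespace, and stripped of surrounding punctuation. This is a guess at the profiler's behaviour, not its documented algorithm, and `word_counts` is a hypothetical name.

```python
import string
from collections import Counter


def word_counts(values):
    """Count whitespace-separated tokens with surrounding punctuation removed."""
    counter = Counter()
    for value in values:
        for token in value.lower().split():
            # A bare "-" placeholder collapses to the empty string here.
            counter[token.strip(string.punctuation)] += 1
    return counter


sample = ["oral", "oral, dermal, inhalation", "oral, inhalation", "-"]
counts = word_counts(sample)
print(counts)
```

Under this assumption the "-" placeholder becomes an empty token, which would explain the row with a blank value at the top of the word table.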
Most occurring characters
| Value | Count | Frequency (%) |
| a | 423399 | |
| l | 349157 | |
| o | 313264 | |
| r | 274840 | |
| n | 149408 | 7.2% |
| i | 149192 | 7.1% |
| - | 145752 | 7.0% |
| t | 75242 | 3.6% |
| h | 73908 | 3.5% |
| e | 38258 | 1.8% |
| Other values (14) | 95488 | 4.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1927275 | |
| Dash Punctuation | 145752 | 7.0% |
| Space Separator | 7447 | 0.4% |
| Other Punctuation | 7434 | 0.4% |
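The category names in this table correspond to Unicode general categories (for example "Dash Punctuation" is `Pd` and "Space Separator" is `Zs`). As an illustration only, the sketch below tallies categories with Python's standard `unicodedata` module; the `CATEGORY_NAMES` mapping and the `category_counts` helper are hypothetical and cover just the categories that appear in this report.

```python
import unicodedata
from collections import Counter

# Two-letter Unicode general category codes mapped to the labels used in this report.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pc": "Connector Punctuation",
    "Sm": "Math Symbol",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counter = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counter[CATEGORY_NAMES.get(code, code)] += 1
    return counter


sample_counts = category_counts(["oral, dermal, inhalation", "-"])
print(sample_counts)
```

In this sample the comma lands in "Other Punctuation" and the lone "-" in "Dash Punctuation", matching how the report attributes those characters.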
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 423399 | |
| l | 349157 | |
| o | 313264 | |
| r | 274840 | |
| n | 149408 | 7.8% |
| i | 149192 | 7.7% |
| t | 75242 | 3.9% |
| h | 73908 | 3.8% |
| e | 38258 | 2.0% |
| m | 36941 | 1.9% |
| Other values (11) | 43666 | 2.3% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 145752 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 7447 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 7434 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1927275 | |
| Common | 160633 | 7.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 423399 | |
| l | 349157 | |
| o | 313264 | |
| r | 274840 | |
| n | 149408 | 7.8% |
| i | 149192 | 7.7% |
| t | 75242 | 3.9% |
| h | 73908 | 3.8% |
| e | 38258 | 2.0% |
| m | 36941 | 1.9% |
| Other values (11) | 43666 | 2.3% |
Common
| Value | Count | Frequency (%) |
| - | 145752 | |
| (space) | 7447 | 4.6% |
| , | 7434 | 4.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2087908 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 423399 | |
| l | 349157 | |
| o | 313264 | |
| r | 274840 | |
| n | 149408 | 7.2% |
| i | 149192 | 7.1% |
| - | 145752 | 7.0% |
| t | 75242 | 3.6% |
| h | 73908 | 3.5% |
| e | 38258 | 1.8% |
| Other values (14) | 95488 | 4.6% |
exposure_route_original
Text
| Distinct | 116 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 194 |
|---|---|
| Median length | 79 |
| Mean length | 4.5925731 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2243573 |
|---|---|
| Distinct characters | 61 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 21 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | oral |
|---|---|
| 2nd row | oral |
| 3rd row | oral |
| 4th row | oral |
| 5th row | oral |
| Value | Count | Frequency (%) |
| oral | 236700 | |
| (empty) | 137947 | |
| inhalation | 73826 | 14.3% |
| dermal | 36905 | 7.2% |
| not | 8372 | 1.6% |
| reported | 8372 | 1.6% |
| and | 4431 | 0.9% |
| acute | 956 | 0.2% |
| sub-chronic | 779 | 0.2% |
| other | 713 | 0.1% |
| Other values (136) | 5978 | 1.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 428883 | |
| l | 348697 | |
| o | 312841 | |
| r | 294248 | |
| n | 154268 | 6.9% |
| - | 139317 | 6.2% |
| i | 135452 | 6.0% |
| t | 94412 | 4.2% |
| h | 76242 | 3.4% |
| e | 57853 | 2.6% |
| Other values (51) | 201360 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 2010424 | |
| Dash Punctuation | 139317 | 6.2% |
| Uppercase Letter | 58520 | 2.6% |
| Space Separator | 26457 | 1.2% |
| Other Punctuation | 8640 | 0.4% |
| Open Punctuation | 102 | < 0.1% |
| Close Punctuation | 102 | < 0.1% |
| Decimal Number | 11 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 428883 | |
| l | 348697 | |
| o | 312841 | |
| r | 294248 | |
| n | 154268 | 7.7% |
| i | 135452 | 6.7% |
| t | 94412 | 4.7% |
| h | 76242 | 3.8% |
| e | 57853 | 2.9% |
| d | 48661 | 2.4% |
| Other values (15) | 58867 | 2.9% |
Uppercase Letter
| Value | Count | Frequency (%) |
| O | 19117 | |
| I | 16404 | |
| N | 9027 | |
| E | 3376 | 5.8% |
| D | 2140 | 3.7% |
| T | 1395 | 2.4% |
| S | 1352 | 2.3% |
| R | 1329 | 2.3% |
| U | 957 | 1.6% |
| A | 821 | 1.4% |
| Other values (10) | 2602 | 4.4% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 6683 | |
| . | 1805 | 20.9% |
| : | 113 | 1.3% |
| / | 36 | 0.4% |
| ; | 2 | < 0.1% |
| ' | 1 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 3 | |
| 0 | 2 | |
| 9 | 2 | |
| 5 | 2 | |
| 2 | 1 | 9.1% |
| 8 | 1 | 9.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 139317 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 26457 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 102 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 102 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2068944 | |
| Common | 174629 | 7.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 428883 | |
| l | 348697 | |
| o | 312841 | |
| r | 294248 | |
| n | 154268 | 7.5% |
| i | 135452 | 6.5% |
| t | 94412 | 4.6% |
| h | 76242 | 3.7% |
| e | 57853 | 2.8% |
| d | 48661 | 2.4% |
| Other values (35) | 117387 | 5.7% |
Common
| Value | Count | Frequency (%) |
| - | 139317 | |
| (space) | 26457 | 15.2% |
| , | 6683 | 3.8% |
| . | 1805 | 1.0% |
| : | 113 | 0.1% |
| ( | 102 | 0.1% |
| ) | 102 | 0.1% |
| / | 36 | < 0.1% |
| 1 | 3 | < 0.1% |
| 0 | 2 | < 0.1% |
| Other values (6) | 9 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2243573 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 428883 | |
| l | 348697 | |
| o | 312841 | |
| r | 294248 | |
| n | 154268 | 6.9% |
| - | 139317 | 6.2% |
| i | 135452 | 6.0% |
| t | 94412 | 4.2% |
| h | 76242 | 3.4% |
| e | 57853 | 2.6% |
| Other values (51) | 201360 |
exposure_method
Text
| Distinct | 65 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 138 |
|---|---|
| Median length | 1 |
| Mean length | 2.7877885 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1361896 |
|---|---|
| Distinct characters | 34 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 12 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | gavage |
|---|---|
| 2nd row | gavage |
| 3rd row | - |
| 4th row | gavage |
| 5th row | feed |
| Value | Count | Frequency (%) |
| (empty) | 297449 | |
| gavage | 91435 | 18.5% |
| feed | 57235 | 11.6% |
| vapor | 17772 | 3.6% |
| aerosol | 7155 | 1.4% |
| water | 6476 | 1.3% |
| drinking | 4836 | 1.0% |
| capsule | 2449 | 0.5% |
| dust | 2411 | 0.5% |
| gas | 1953 | 0.4% |
| Other values (70) | 5984 | 1.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 297443 | |
| e | 225790 | |
| a | 220278 | |
| g | 190338 | |
| v | 109297 | 8.0% |
| d | 68208 | 5.0% |
| f | 57411 | 4.2% |
| r | 37238 | 2.7% |
| o | 36047 | 2.6% |
| p | 21851 | 1.6% |
| Other values (24) | 97995 | 7.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1057761 | |
| Dash Punctuation | 297443 | 21.8% |
| Space Separator | 6633 | 0.5% |
| Other Punctuation | 45 | < 0.1% |
| Uppercase Letter | 12 | < 0.1% |
| Open Punctuation | 1 | < 0.1% |
| Close Punctuation | 1 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 225790 | |
| a | 220278 | |
| g | 190338 | |
| v | 109297 | |
| d | 68208 | 6.4% |
| f | 57411 | 5.4% |
| r | 37238 | 3.5% |
| o | 36047 | 3.4% |
| p | 21851 | 2.1% |
| s | 15401 | 1.5% |
| Other values (15) | 75902 | 7.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| G | 10 | |
| D | 1 | 8.3% |
| W | 1 | 8.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 23 | |
| , | 22 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 297443 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 6633 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 1 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1057773 | |
| Common | 304123 | 22.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 225790 | |
| a | 220278 | |
| g | 190338 | |
| v | 109297 | |
| d | 68208 | 6.4% |
| f | 57411 | 5.4% |
| r | 37238 | 3.5% |
| o | 36047 | 3.4% |
| p | 21851 | 2.1% |
| s | 15401 | 1.5% |
| Other values (18) | 75914 | 7.2% |
Common
| Value | Count | Frequency (%) |
| - | 297443 | |
| (space) | 6633 | 2.2% |
| / | 23 | < 0.1% |
| , | 22 | < 0.1% |
| ( | 1 | < 0.1% |
| ) | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1361896 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 297443 | |
| e | 225790 | |
| a | 220278 | |
| g | 190338 | |
| v | 109297 | 8.0% |
| d | 68208 | 5.0% |
| f | 57411 | 4.2% |
| r | 37238 | 2.7% |
| o | 36047 | 2.6% |
| p | 21851 | 1.6% |
| Other values (24) | 97995 | 7.2% |
exposure_method_original
Text
| Distinct | 127 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 255 |
|---|---|
| Median length | 1 |
| Mean length | 3.375023 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1648773 |
|---|---|
| Distinct characters | 68 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 18 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | gavage |
|---|---|
| 2nd row | gavage |
| 3rd row | unspecified |
| 4th row | gavage |
| 5th row | feed |
| Value | Count | Frequency (%) |
| (empty) | 284740 | |
| gavage | 77801 | 15.6% |
| feed | 56102 | 11.3% |
| vapour | 16252 | 3.3% |
| gavage/intubation | 13090 | 2.6% |
| unspecified | 10875 | 2.2% |
| aerosol | 7155 | 1.4% |
| water | 6501 | 1.3% |
| drinking | 5002 | 1.0% |
| capsule | 2449 | 0.5% |
| Other values (162) | 17183 | 3.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 285501 | |
| e | 245875 | |
| a | 233773 | |
| g | 184660 | |
| v | 108317 | 6.6% |
| d | 77627 | 4.7% |
| f | 67613 | 4.1% |
| i | 63150 | 3.8% |
| o | 52678 | 3.2% |
| n | 50519 | 3.1% |
| Other values (58) | 279060 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1325059 | |
| Dash Punctuation | 285501 | 17.3% |
| Other Punctuation | 13286 | 0.8% |
| Uppercase Letter | 12821 | 0.8% |
| Space Separator | 8628 | 0.5% |
| Open Punctuation | 1682 | 0.1% |
| Close Punctuation | 1682 | 0.1% |
| Decimal Number | 106 | < 0.1% |
| Math Symbol | 8 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 245875 | |
| a | 233773 | |
| g | 184660 | |
| v | 108317 | |
| d | 77627 | 5.9% |
| f | 67613 | 5.1% |
| i | 63150 | 4.8% |
| o | 52678 | 4.0% |
| n | 50519 | 3.8% |
| u | 45797 | 3.5% |
| Other values (16) | 195050 |
Uppercase Letter
| Value | Count | Frequency (%) |
| G | 4722 | |
| I | 1557 | 12.1% |
| F | 1528 | 11.9% |
| D | 1294 | 10.1% |
| V | 941 | 7.3% |
| O | 805 | 6.3% |
| N | 741 | 5.8% |
| W | 689 | 5.4% |
| A | 160 | 1.2% |
| U | 115 | 0.9% |
| Other values (8) | 269 | 2.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 25 | |
| 1 | 18 | |
| 2 | 15 | |
| 9 | 14 | |
| 8 | 14 | |
| 3 | 8 | 7.5% |
| 4 | 6 | 5.7% |
| 5 | 4 | 3.8% |
| 6 | 2 | 1.9% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 13117 | |
| : | 117 | 0.9% |
| , | 25 | 0.2% |
| . | 18 | 0.1% |
| % | 3 | < 0.1% |
| ; | 3 | < 0.1% |
| ' | 2 | < 0.1% |
| * | 1 | < 0.1% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 873 | |
| [ | 809 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 873 | |
| ] | 809 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 285501 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 8628 |
Math Symbol
| Value | Count | Frequency (%) |
| = | 8 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1337880 | |
| Common | 310893 | 18.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 245875 | |
| a | 233773 | |
| g | 184660 | |
| v | 108317 | |
| d | 77627 | 5.8% |
| f | 67613 | 5.1% |
| i | 63150 | 4.7% |
| o | 52678 | 3.9% |
| n | 50519 | 3.8% |
| u | 45797 | 3.4% |
| Other values (34) | 207871 |
Common
| Value | Count | Frequency (%) |
| - | 285501 | |
| / | 13117 | 4.2% |
| (space) | 8628 | 2.8% |
| ( | 873 | 0.3% |
| ) | 873 | 0.3% |
| [ | 809 | 0.3% |
| ] | 809 | 0.3% |
| : | 117 | < 0.1% |
| 0 | 25 | < 0.1% |
| , | 25 | < 0.1% |
| Other values (14) | 116 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1648773 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 285501 | |
| e | 245875 | |
| a | 233773 | |
| g | 184660 | |
| v | 108317 | 6.6% |
| d | 77627 | 4.7% |
| f | 67613 | 4.1% |
| i | 63150 | 3.8% |
| o | 52678 | 3.2% |
| n | 50519 | 3.1% |
| Other values (58) | 279060 |
exposure_form
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - | 488422 |
|---|---|
| food | 38 |
| water | 24 |
| food and water | 22 |
| nose-only | 6 |
| Other values (5) | 10 |
Length
| Max length | 22 |
|---|---|
| Median length | 1 |
| Mean length | 1.0012978 |
| Min length | 1 |
Characters and Unicode
| Total characters | 489156 |
|---|---|
| Distinct characters | 22 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
Common Values
| Value | Count | Frequency (%) |
| - | 488422 | |
| food | 38 | < 0.1% |
| water | 24 | < 0.1% |
| food and water | 22 | < 0.1% |
| nose-only | 6 | < 0.1% |
| supplement | 5 | < 0.1% |
| juice | 2 | < 0.1% |
| formula | 1 | < 0.1% |
| formula or breast milk | 1 | < 0.1% |
| breast milk | 1 | < 0.1% |
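Although the report lists 0 missing values for exposure_form, almost every row holds the "-" placeholder. The sketch below estimates the effective missing rate when that placeholder is treated as missing. Reading "-" as "not reported" is an assumption about this dataset rather than something the report states, and `placeholder_share` is a hypothetical helper.

```python
def placeholder_share(counts: dict, placeholder: str = "-") -> float:
    """Fraction of rows whose value equals the placeholder string."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("counts must not be empty")
    return counts.get(placeholder, 0) / total


# Counts copied from the exposure_form "Common Values" table.
exposure_form_counts = {
    "-": 488_422,
    "food": 38,
    "water": 24,
    "food and water": 22,
    "nose-only": 6,
    "supplement": 5,
    "juice": 2,
    "formula": 1,
    "formula or breast milk": 1,
    "breast milk": 1,
}

share = placeholder_share(exposure_form_counts)
print(f"placeholder share: {share:.2%}")
```

Only 100 of the 488522 rows carry an informative value, so the column is close to constant for modelling purposes.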
| Value | Count | Frequency (%) |
| (empty) | 488422 | |
| food | 60 | < 0.1% |
| water | 46 | < 0.1% |
| and | 22 | < 0.1% |
| nose-only | 6 | < 0.1% |
| supplement | 5 | < 0.1% |
| juice | 2 | < 0.1% |
| formula | 2 | < 0.1% |
| breast | 2 | < 0.1% |
| milk | 2 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 488428 | |
| o | 135 | < 0.1% |
| d | 82 | < 0.1% |
| a | 72 | < 0.1% |
| e | 66 | < 0.1% |
| f | 62 | < 0.1% |
| t | 53 | < 0.1% |
| r | 51 | < 0.1% |
| (space) | 48 | < 0.1% |
| w | 46 | < 0.1% |
| Other values (12) | 113 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 488428 | |
| Lowercase Letter | 680 | 0.1% |
| Space Separator | 48 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 135 | |
| d | 82 | |
| a | 72 | |
| e | 66 | |
| f | 62 | |
| t | 53 | 7.8% |
| r | 51 | 7.5% |
| w | 46 | 6.8% |
| n | 39 | 5.7% |
| l | 15 | 2.2% |
| Other values (10) | 59 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 488428 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 48 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 488476 | |
| Latin | 680 | 0.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| o | 135 | |
| d | 82 | |
| a | 72 | |
| e | 66 | |
| f | 62 | |
| t | 53 | 7.8% |
| r | 51 | 7.5% |
| w | 46 | 6.8% |
| n | 39 | 5.7% |
| l | 15 | 2.2% |
| Other values (10) | 59 |
Common
| Value | Count | Frequency (%) |
| - | 488428 | |
| (space) | 48 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 489156 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 488428 | |
| o | 135 | < 0.1% |
| d | 82 | < 0.1% |
| a | 72 | < 0.1% |
| e | 66 | < 0.1% |
| f | 62 | < 0.1% |
| t | 53 | < 0.1% |
| r | 51 | < 0.1% |
| (space) | 48 | < 0.1% |
| w | 46 | < 0.1% |
| Other values (12) | 113 | < 0.1% |
exposure_form_original
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 16 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - | 488422 |
|---|---|
| Feed | 38 |
| Daily ingestion of food and water | 22 |
| Daily ingestion of drinking water | 10 |
| ACUTE EXPOSURE (Exposure was nose-only.) | 6 |
| Other values (11) | 24 |
Length
| Max length | 120 |
|---|---|
| Median length | 1 |
| Mean length | 1.0049455 |
| Min length | 1 |
Characters and Unicode
| Total characters | 490938 |
|---|---|
| Distinct characters | 45 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
Common Values
| Value | Count | Frequency (%) |
| - | 488422 | |
| Feed | 38 | < 0.1% |
| Daily ingestion of food and water | 22 | < 0.1% |
| Daily ingestion of drinking water | 10 | < 0.1% |
| ACUTE EXPOSURE (Exposure was nose-only.) | 6 | < 0.1% |
| Daily ingestion of supplement | 5 | < 0.1% |
| Weekly dose administered in water | 4 | < 0.1% |
| Weekly bolus dose of de-ionized water | 2 | < 0.1% |
| Single dose in bottled spring water | 2 | < 0.1% |
| Daily ingestion of tap water | 2 | < 0.1% |
| Other values (6) | 9 | < 0.1% |
| Value | Count | Frequency (%) |
| (empty) | 488422 | |
| of | 50 | < 0.1% |
| daily | 46 | < 0.1% |
| ingestion | 46 | < 0.1% |
| water | 46 | < 0.1% |
| feed | 38 | < 0.1% |
| and | 24 | < 0.1% |
| food | 22 | < 0.1% |
| exposure | 16 | < 0.1% |
| drinking | 10 | < 0.1% |
| Other values (32) | 122 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 488432 | |
| (space) | 320 | 0.1% |
| e | 280 | 0.1% |
| o | 211 | < 0.1% |
| i | 200 | < 0.1% |
| n | 187 | < 0.1% |
| a | 154 | < 0.1% |
| t | 133 | < 0.1% |
| d | 132 | < 0.1% |
| s | 97 | < 0.1% |
| Other values (35) | 792 | 0.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 488432 | |
| Lowercase Letter | 1982 | 0.4% |
| Space Separator | 320 | 0.1% |
| Uppercase Letter | 178 | < 0.1% |
| Other Punctuation | 12 | < 0.1% |
| Close Punctuation | 6 | < 0.1% |
| Open Punctuation | 6 | < 0.1% |
| Decimal Number | 2 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 280 | |
| o | 211 | |
| i | 200 | |
| n | 187 | |
| a | 154 | 7.8% |
| t | 133 | 6.7% |
| d | 132 | 6.7% |
| s | 97 | 4.9% |
| l | 91 | 4.6% |
| r | 89 | 4.5% |
| Other values (15) | 408 |
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 46 | |
| F | 38 | |
| E | 24 | |
| U | 12 | 6.7% |
| W | 8 | 4.5% |
| S | 8 | 4.5% |
| T | 6 | 3.4% |
| R | 6 | 3.4% |
| O | 6 | 3.4% |
| P | 6 | 3.4% |
| Other values (3) | 18 | 10.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 6 | |
| . | 6 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 488432 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 320 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 6 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 6 |
Decimal Number
| Value | Count | Frequency (%) |
| 9 | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 488778 | |
| Latin | 2160 | 0.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 280 | |
| o | 211 | 9.8% |
| i | 200 | 9.3% |
| n | 187 | 8.7% |
| a | 154 | 7.1% |
| t | 133 | 6.2% |
| d | 132 | 6.1% |
| s | 97 | 4.5% |
| l | 91 | 4.2% |
| r | 89 | 4.1% |
| Other values (28) | 586 |
Common
| Value | Count | Frequency (%) |
| - | 488432 | |
| (space) | 320 | 0.1% |
| , | 6 | < 0.1% |
| ) | 6 | < 0.1% |
| . | 6 | < 0.1% |
| ( | 6 | < 0.1% |
| 9 | 2 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 490938 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 488432 | |
| (space) | 320 | 0.1% |
| e | 280 | 0.1% |
| o | 211 | < 0.1% |
| i | 200 | < 0.1% |
| n | 187 | < 0.1% |
| a | 154 | < 0.1% |
| t | 133 | < 0.1% |
| d | 132 | < 0.1% |
| s | 97 | < 0.1% |
| Other values (35) | 792 | 0.2% |
media
Text
| Distinct | 75 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 48 |
|---|---|
| Median length | 1 |
| Mean length | 1.1806306 |
| Min length | 1 |
Characters and Unicode
| Total characters | 576764 |
|---|---|
| Distinct characters | 47 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 7 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
| Value | Count | Frequency (%) |
| (empty) | 478212 | |
| water | 6397 | 1.3% |
| fresh | 4457 | 0.9% |
| soil | 1244 | 0.3% |
| oil | 545 | 0.1% |
| air | 446 | 0.1% |
| sediment | 384 | 0.1% |
| corn | 375 | 0.1% |
| aqueous | 342 | 0.1% |
| methylcellulose | 340 | 0.1% |
| Other values (86) | 2560 | 0.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 478327 | |
| e | 15528 | 2.7% |
| r | 12171 | 2.1% |
| t | 8268 | 1.4% |
| a | 8111 | 1.4% |
| s | 7345 | 1.3% |
| w | 6917 | 1.2% |
| (space) | 6780 | 1.2% |
| h | 5094 | 0.9% |
| f | 4995 | 0.9% |
| Other values (37) | 23228 | 4.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 478327 | |
| Lowercase Letter | 90539 | 15.7% |
| Space Separator | 6780 | 1.2% |
| Decimal Number | 625 | 0.1% |
| Other Punctuation | 355 | 0.1% |
| Uppercase Letter | 138 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 15528 | |
| r | 12171 | |
| t | 8268 | |
| a | 8111 | |
| s | 7345 | |
| w | 6917 | |
| h | 5094 | 5.6% |
| f | 4995 | 5.5% |
| o | 4402 | 4.9% |
| l | 4371 | 4.8% |
| Other values (16) | 13337 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 291 | |
| 2 | 161 | |
| 8 | 110 | 17.6% |
| 5 | 37 | 5.9% |
| 7 | 14 | 2.2% |
| 1 | 6 | 1.0% |
| 3 | 6 | 1.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 75 | |
| P | 17 | 12.3% |
| G | 17 | 12.3% |
| S | 17 | 12.3% |
| A | 6 | 4.3% |
| M | 3 | 2.2% |
| Q | 3 | 2.2% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 168 | |
| % | 81 | |
| . | 77 | |
| , | 27 | 7.6% |
| ; | 2 | 0.6% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 478327 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 6780 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 486087 | |
| Latin | 90677 | 15.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 15528 | |
| r | 12171 | |
| t | 8268 | |
| a | 8111 | |
| s | 7345 | |
| w | 6917 | |
| h | 5094 | 5.6% |
| f | 4995 | 5.5% |
| o | 4402 | 4.9% |
| l | 4371 | 4.8% |
| Other values (23) | 13475 |
Common
| Value | Count | Frequency (%) |
| - | 478327 | |
| (space) | 6780 | 1.4% |
| 0 | 291 | 0.1% |
| / | 168 | < 0.1% |
| 2 | 161 | < 0.1% |
| 8 | 110 | < 0.1% |
| % | 81 | < 0.1% |
| . | 77 | < 0.1% |
| 5 | 37 | < 0.1% |
| , | 27 | < 0.1% |
| Other values (4) | 28 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 576764 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 478327 | |
| e | 15528 | 2.7% |
| r | 12171 | 2.1% |
| t | 8268 | 1.4% |
| a | 8111 | 1.4% |
| s | 7345 | 1.3% |
| w | 6917 | 1.2% |
| (space) | 6780 | 1.2% |
| h | 5094 | 0.9% |
| f | 4995 | 0.9% |
| Other values (37) | 23228 | 4.0% |
media_original
Text
| Distinct | 118 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 34 |
| Missing (%) | < 0.1% |
| Memory size | 3.7 MiB |
Length
| Max length | 58 |
|---|---|
| Median length | 1 |
| Mean length | 1.2107278 |
| Min length | 1 |
Characters and Unicode
| Total characters | 591426 |
|---|---|
| Distinct characters | 65 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 12 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
| Value | Count | Frequency (%) |
| (empty) | 476867 | |
| freshwater | 4512 | 0.9% |
| water | 1954 | 0.4% |
| soil | 1244 | 0.3% |
| none | 1059 | 0.2% |
| oil | 549 | 0.1% |
| air | 446 | 0.1% |
| sediment | 384 | 0.1% |
| aqueous | 381 | 0.1% |
| corn | 375 | 0.1% |
| Other values (138) | 5231 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 477187 | |
| e | 16404 | 2.8% |
| r | 11952 | 2.0% |
| t | 7766 | 1.3% |
| a | 7506 | 1.3% |
| s | 6128 | 1.0% |
| w | 5880 | 1.0% |
| h | 5399 | 0.9% |
| o | 5151 | 0.9% |
| f | 5074 | 0.9% |
| Other values (55) | 42979 | 7.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 477187 | |
| Lowercase Letter | 91360 | 15.4% |
| Uppercase Letter | 14925 | 2.5% |
| Space Separator | 4514 | 0.8% |
| Decimal Number | 1715 | 0.3% |
| Other Punctuation | 1354 | 0.2% |
| Open Punctuation | 185 | < 0.1% |
| Close Punctuation | 185 | < 0.1% |
| Math Symbol | 1 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 16404 | |
| r | 11952 | |
| t | 7766 | |
| a | 7506 | |
| s | 6128 | 6.7% |
| w | 5880 | 6.4% |
| h | 5399 | 5.9% |
| o | 5151 | 5.6% |
| f | 5074 | 5.6% |
| n | 4546 | 5.0% |
| Other values (16) | 15554 |
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 1867 | |
| T | 1750 | |
| S | 1746 | |
| I | 1644 | |
| O | 1371 | |
| L | 1245 | |
| A | 1176 | |
| R | 1175 | |
| W | 1116 | |
| N | 551 | 3.7% |
| Other values (9) | 1284 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 802 | |
| 5 | 395 | |
| 2 | 196 | 11.4% |
| 8 | 144 | 8.4% |
| 1 | 118 | 6.9% |
| 7 | 34 | 2.0% |
| 6 | 14 | 0.8% |
| 4 | 6 | 0.3% |
| 3 | 6 | 0.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| % | 554 | |
| . | 528 | |
| / | 78 | 5.8% |
| , | 76 | 5.6% |
| ; | 74 | 5.5% |
| : | 44 | 3.2% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 477187 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 4514 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 185 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 185 |
Math Symbol
| Value | Count | Frequency (%) |
| < | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 485141 | |
| Latin | 106285 | 18.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 16404 | |
| r | 11952 | 11.2% |
| t | 7766 | 7.3% |
| a | 7506 | 7.1% |
| s | 6128 | 5.8% |
| w | 5880 | 5.5% |
| h | 5399 | 5.1% |
| o | 5151 | 4.8% |
| f | 5074 | 4.8% |
| n | 4546 | 4.3% |
| Other values (35) | 30479 |
Common
| Value | Count | Frequency (%) |
| - | 477187 | |
| (space) | 4514 | 0.9% |
| 0 | 802 | 0.2% |
| % | 554 | 0.1% |
| . | 528 | 0.1% |
| 5 | 395 | 0.1% |
| 2 | 196 | < 0.1% |
| ( | 185 | < 0.1% |
| ) | 185 | < 0.1% |
| 8 | 144 | < 0.1% |
| Other values (10) | 451 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 591426 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 477187 | |
| e | 16404 | 2.8% |
| r | 11952 | 2.0% |
| t | 7766 | 1.3% |
| a | 7506 | 1.3% |
| s | 6128 | 1.0% |
| w | 5880 | 1.0% |
| h | 5399 | 0.9% |
| o | 5151 | 0.9% |
| f | 5074 | 0.9% |
| Other values (55) | 42979 | 7.3% |
lifestage
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - | 431943 |
|---|---|
| adult | 45914 |
| juvenile | 4823 |
| adult-pregnancy | 3341 |
| fetal | 2481 |
| Other values (6) | 20 |
Length
| Max length | 33 |
|---|---|
| Median length | 1 |
| Mean length | 1.5616472 |
| Min length | 1 |
Characters and Unicode
| Total characters | 762899 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
Common Values
| Value | Count | Frequency (%) |
| - | 431943 | |
| adult | 45914 | 9.4% |
| juvenile | 4823 | 1.0% |
| adult-pregnancy | 3341 | 0.7% |
| fetal | 2481 | 0.5% |
| child | 6 | < 0.1% |
| adolescent | 4 | < 0.1% |
| adult-lactation | 4 | < 0.1% |
| adult woman, pregant or lactating | 3 | < 0.1% |
| adults and youths | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| (empty) | 431943 | |
| adult | 45917 | 9.4% |
| juvenile | 4823 | 1.0% |
| adult-pregnancy | 3341 | 0.7% |
| fetal | 2481 | 0.5% |
| child | 6 | < 0.1% |
| adolescent | 4 | < 0.1% |
| adult-lactation | 4 | < 0.1% |
| woman | 3 | < 0.1% |
| pregant | 3 | < 0.1% |
| Other values (6) | 15 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 435288 | |
| l | 56587 | 7.4% |
| a | 55114 | 7.2% |
| u | 54090 | 7.1% |
| t | 51769 | 6.8% |
| d | 49279 | 6.5% |
| e | 15480 | 2.0% |
| n | 11526 | 1.5% |
| i | 4837 | 0.6% |
| j | 4823 | 0.6% |
| Other values (14) | 24106 | 3.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 435288 | |
| Lowercase Letter | 327590 | |
| Space Separator | 18 | < 0.1% |
| Other Punctuation | 3 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| l | 56587 | |
| a | 55114 | |
| u | 54090 | |
| t | 51769 | |
| d | 49279 | |
| e | 15480 | 4.7% |
| n | 11526 | 3.5% |
| i | 4837 | 1.5% |
| j | 4823 | 1.5% |
| v | 4823 | 1.5% |
| Other values (11) | 19262 | 5.9% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 435288 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 18 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 3 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 435309 | |
| Latin | 327590 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| l | 56587 | |
| a | 55114 | |
| u | 54090 | |
| t | 51769 | |
| d | 49279 | |
| e | 15480 | 4.7% |
| n | 11526 | 3.5% |
| i | 4837 | 1.5% |
| j | 4823 | 1.5% |
| v | 4823 | 1.5% |
| Other values (11) | 19262 | 5.9% |
Common
| Value | Count | Frequency (%) |
| - | 435288 | |
| (space) | 18 | < 0.1% |
| , | 3 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 762899 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 435288 | |
| l | 56587 | 7.4% |
| a | 55114 | 7.2% |
| u | 54090 | 7.1% |
| t | 51769 | 6.8% |
| d | 49279 | 6.5% |
| e | 15480 | 2.0% |
| n | 11526 | 1.5% |
| i | 4837 | 0.6% |
| j | 4823 | 0.6% |
| Other values (14) | 24106 | 3.2% |
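The length and frequency figures reported for `lifestage` can be cross-checked from the counts alone. The following is a minimal sketch, assuming that the mean length is the total character count divided by the number of observations (488522, from the dataset statistics) and that each frequency is the count divided by the number of observations. The variable names are illustrative and do not come from the report.

```python
# Cross-check summary figures for `lifestage` using numbers copied from the report.
N_OBSERVATIONS = 488522      # "Number of observations" in the dataset statistics
TOTAL_CHARACTERS = 762899    # "Total characters" for lifestage

# Counts copied from the "Common Values" table.
common_values = {
    "-": 431943,
    "adult": 45914,
    "juvenile": 4823,
    "adult-pregnancy": 3341,
    "fetal": 2481,
}

# Assumption: mean length = total characters / number of rows.
mean_length = TOTAL_CHARACTERS / N_OBSERVATIONS

# Assumption: frequency (%) = count / number of rows, rounded to one decimal.
frequencies = {
    value: round(100 * count / N_OBSERVATIONS, 1)
    for value, count in common_values.items()
}

print(f"mean length = {mean_length:.7f}")
for value, pct in frequencies.items():
    print(f"{value!r}: {pct}%")
```

Under these assumptions the computed mean length and the percentages for `adult`, `juvenile`, `adult-pregnancy` and `fetal` agree with the values printed in the tables above. The same formula also gives the percentage for the `-` placeholder row, which is blank in the tables.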
lifestage_original
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - | 431943 |
|---|---|
| adult | 45836 |
| juvenile | 4823 |
| adult-pregnancy | 3341 |
| fetal | 2481 |
| Other values (8) | 98 |
Length
| Max length | 33 |
|---|---|
| Median length | 1 |
| Mean length | 1.561684 |
| Min length | 1 |
Characters and Unicode
| Total characters | 762917 |
|---|---|
| Distinct characters | 28 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
Common Values
| Value | Count | Frequency (%) |
| - | 431943 | |
| adult | 45836 | 9.4% |
| juvenile | 4823 | 1.0% |
| adult-pregnancy | 3341 | 0.7% |
| fetal | 2481 | 0.5% |
| Adult | 74 | < 0.1% |
| Children | 6 | < 0.1% |
| Infant | 4 | < 0.1% |
| Adolescent | 4 | < 0.1% |
| adult-lactation | 4 | < 0.1% |
| Other values (3) | 6 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| | 431943 | |
| adult | 45915 | 9.4% |
| juvenile | 4823 | 1.0% |
| adult-pregnancy | 3341 | 0.7% |
| fetal | 2481 | 0.5% |
| children | 7 | < 0.1% |
| infant | 4 | < 0.1% |
| adolescent | 4 | < 0.1% |
| adult-lactation | 4 | < 0.1% |
| woman | 3 | < 0.1% |
| Other values (6) | 15 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 435288 | |
| l | 56583 | 7.4% |
| a | 55032 | 7.2% |
| u | 54086 | 7.1% |
| t | 51767 | 6.8% |
| d | 49275 | 6.5% |
| e | 15486 | 2.0% |
| n | 11542 | 1.5% |
| i | 4837 | 0.6% |
| j | 4823 | 0.6% |
| Other values (18) | 24198 | 3.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 435288 | |
| Lowercase Letter | 327514 | |
| Uppercase Letter | 94 | < 0.1% |
| Space Separator | 18 | < 0.1% |
| Other Punctuation | 3 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| l | 56583 | |
| a | 55032 | |
| u | 54086 | |
| t | 51767 | |
| d | 49275 | |
| e | 15486 | 4.7% |
| n | 11542 | 3.5% |
| i | 4837 | 1.5% |
| j | 4823 | 1.5% |
| v | 4823 | 1.5% |
| Other values (11) | 19260 | 5.9% |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 82 | |
| C | 6 | 6.4% |
| I | 4 | 4.3% |
| Y | 2 | 2.1% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 435288 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 18 |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 3 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 435309 | |
| Latin | 327608 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| l | 56583 | |
| a | 55032 | |
| u | 54086 | |
| t | 51767 | |
| d | 49275 | |
| e | 15486 | 4.7% |
| n | 11542 | 3.5% |
| i | 4837 | 1.5% |
| j | 4823 | 1.5% |
| v | 4823 | 1.5% |
| Other values (15) | 19354 | 5.9% |
Common
| Value | Count | Frequency (%) |
| - | 435288 | |
| (space) | 18 | < 0.1% |
| , | 3 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 762917 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 435288 | |
| l | 56583 | 7.4% |
| a | 55032 | 7.2% |
| u | 54086 | 7.1% |
| t | 51767 | 6.8% |
| d | 49275 | 6.5% |
| e | 15486 | 2.0% |
| n | 11542 | 1.5% |
| i | 4837 | 0.6% |
| j | 4823 | 0.6% |
| Other values (18) | 24198 | 3.2% |
generation
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - | 430688 |
|---|---|
| F0 | 45107 |
| F1 | 6388 |
| Fetal | 2825 |
| F2 | 2388 |
| Other values (3) | 1126 |
Length
| Max length | 7 |
|---|---|
| Median length | 1 |
| Mean length | 1.1376683 |
| Min length | 1 |
Characters and Unicode
| Total characters | 555776 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
Common Values
| Value | Count | Frequency (%) |
| - | 430688 | |
| F0 | 45107 | 9.2% |
| F1 | 6388 | 1.3% |
| Fetal | 2825 | 0.6% |
| F2 | 2388 | 0.5% |
| P0 | 722 | 0.1% |
| F3 | 215 | < 0.1% |
| unknown | 189 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| | 430688 | |
| f0 | 45107 | 9.2% |
| f1 | 6388 | 1.3% |
| fetal | 2825 | 0.6% |
| f2 | 2388 | 0.5% |
| p0 | 722 | 0.1% |
| f3 | 215 | < 0.1% |
| unknown | 189 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 430688 | |
| F | 56923 | 10.2% |
| 0 | 45829 | 8.2% |
| 1 | 6388 | 1.1% |
| e | 2825 | 0.5% |
| t | 2825 | 0.5% |
| a | 2825 | 0.5% |
| l | 2825 | 0.5% |
| 2 | 2388 | 0.4% |
| P | 722 | 0.1% |
| Other values (6) | 1538 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 430688 | |
| Uppercase Letter | 57645 | 10.4% |
| Decimal Number | 54820 | 9.9% |
| Lowercase Letter | 12623 | 2.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 2825 | |
| t | 2825 | |
| a | 2825 | |
| l | 2825 | |
| n | 567 | 4.5% |
| u | 189 | 1.5% |
| k | 189 | 1.5% |
| o | 189 | 1.5% |
| w | 189 | 1.5% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 45829 | |
| 1 | 6388 | 11.7% |
| 2 | 2388 | 4.4% |
| 3 | 215 | 0.4% |
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 56923 | |
| P | 722 | 1.3% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 430688 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 485508 | |
| Latin | 70268 | 12.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| F | 56923 | |
| e | 2825 | 4.0% |
| t | 2825 | 4.0% |
| a | 2825 | 4.0% |
| l | 2825 | 4.0% |
| P | 722 | 1.0% |
| n | 567 | 0.8% |
| u | 189 | 0.3% |
| k | 189 | 0.3% |
| o | 189 | 0.3% |
Common
| Value | Count | Frequency (%) |
| - | 430688 | |
| 0 | 45829 | 9.4% |
| 1 | 6388 | 1.3% |
| 2 | 2388 | 0.5% |
| 3 | 215 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 555776 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 430688 | |
| F | 56923 | 10.2% |
| 0 | 45829 | 8.2% |
| 1 | 6388 | 1.1% |
| e | 2825 | 0.5% |
| t | 2825 | 0.5% |
| a | 2825 | 0.5% |
| l | 2825 | 0.5% |
| 2 | 2388 | 0.4% |
| P | 722 | 0.1% |
| Other values (6) | 1538 | 0.3% |
generation_original
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - | 430688 |
|---|---|
| F0 | 45107 |
| F1 | 6388 |
| Fetal | 2825 |
| F2 | 2388 |
| Other values (3) | 1126 |
Length
| Max length | 7 |
|---|---|
| Median length | 1 |
| Mean length | 1.1376683 |
| Min length | 1 |
Characters and Unicode
| Total characters | 555776 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
Common Values
| Value | Count | Frequency (%) |
| - | 430688 | |
| F0 | 45107 | 9.2% |
| F1 | 6388 | 1.3% |
| Fetal | 2825 | 0.6% |
| F2 | 2388 | 0.5% |
| P0 | 722 | 0.1% |
| F3 | 215 | < 0.1% |
| unknown | 189 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| | 430688 | |
| f0 | 45107 | 9.2% |
| f1 | 6388 | 1.3% |
| fetal | 2825 | 0.6% |
| f2 | 2388 | 0.5% |
| p0 | 722 | 0.1% |
| f3 | 215 | < 0.1% |
| unknown | 189 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 430688 | |
| F | 56923 | 10.2% |
| 0 | 45829 | 8.2% |
| 1 | 6388 | 1.1% |
| e | 2825 | 0.5% |
| t | 2825 | 0.5% |
| a | 2825 | 0.5% |
| l | 2825 | 0.5% |
| 2 | 2388 | 0.4% |
| P | 722 | 0.1% |
| Other values (6) | 1538 | 0.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 430688 | |
| Uppercase Letter | 57645 | 10.4% |
| Decimal Number | 54820 | 9.9% |
| Lowercase Letter | 12623 | 2.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 2825 | |
| t | 2825 | |
| a | 2825 | |
| l | 2825 | |
| n | 567 | 4.5% |
| u | 189 | 1.5% |
| k | 189 | 1.5% |
| o | 189 | 1.5% |
| w | 189 | 1.5% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 45829 | |
| 1 | 6388 | 11.7% |
| 2 | 2388 | 4.4% |
| 3 | 215 | 0.4% |
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 56923 | |
| P | 722 | 1.3% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 430688 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 485508 | |
| Latin | 70268 | 12.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| F | 56923 | |
| e | 2825 | 4.0% |
| t | 2825 | 4.0% |
| a | 2825 | 4.0% |
| l | 2825 | 4.0% |
| P | 722 | 1.0% |
| n | 567 | 0.8% |
| u | 189 | 0.3% |
| k | 189 | 0.3% |
| o | 189 | 0.3% |
Common
| Value | Count | Frequency (%) |
| - | 430688 | |
| 0 | 45829 | 9.4% |
| 1 | 6388 | 1.3% |
| 2 | 2388 | 0.5% |
| 3 | 215 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 555776 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 430688 | |
| F | 56923 | 10.2% |
| 0 | 45829 | 8.2% |
| 1 | 6388 | 1.1% |
| e | 2825 | 0.5% |
| t | 2825 | 0.5% |
| a | 2825 | 0.5% |
| l | 2825 | 0.5% |
| 2 | 2388 | 0.4% |
| P | 722 | 0.1% |
| Other values (6) | 1538 | 0.3% |
year
Unsupported
REJECTED  UNSUPPORTED 
| Missing | 0 |
|---|---|
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
year_original
Unsupported
REJECTED  UNSUPPORTED 
| Missing | 0 |
|---|---|
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
mw
Real number (ℝ)
SKEWED 
| Distinct | 23840 |
|---|---|
| Distinct (%) | 4.9% |
| Missing | 267 |
| Missing (%) | 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 204.23171 |
| Minimum | -1 |
|---|---|
| Maximum | 900000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 111707 |
| Negative (%) | 22.9% |
| Memory size | 3.7 MiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | -1 |
| Q1 | 58.44 |
| median | 169.872 |
| Q3 | 295.32 |
| 95-th percentile | 474.82 |
| Maximum | 900000 |
| Range | 900001 |
| Interquartile range (IQR) | 236.88 |
Descriptive statistics
| Standard deviation | 2298.4744 |
|---|---|
| Coefficient of variation (CV) | 11.254248 |
| Kurtosis | 144436.89 |
| Mean | 204.23171 |
| Median Absolute Deviation (MAD) | 121.519 |
| Skewness | 371.01917 |
| Sum | 99717153 |
| Variance | 5282984.5 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| -1 | 111707 | 22.9% |
| 159.6 | 2736 | 0.6% |
| 183.31 | 1515 | 0.3% |
| 266.32 | 1421 | 0.3% |
| 100.117 | 1230 | 0.3% |
| 364.9 | 1216 | 0.2% |
| 94.113 | 1148 | 0.2% |
| 161.44 | 1099 | 0.2% |
| 201.225 | 1050 | 0.2% |
| 380.9 | 1022 | 0.2% |
| Other values (23830) | 364111 |
Minimum 10 values
| Value | Count | Frequency (%) |
| -1 | 111707 | |
| 2.016 | 4 | < 0.1% |
| 2.02 | 6 | < 0.1% |
| 3.01605 | 4 | < 0.1% |
| 4 | 6 | < 0.1% |
| 4.0015 | 1 | < 0.1% |
| 4.0026 | 3 | < 0.1% |
| 4.0282 | 3 | < 0.1% |
| 4.03 | 6 | < 0.1% |
| 6.94 | 29 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 900000 | 3 | |
| 150000 | 3 | |
| 70000 | 3 | |
| 64000 | 3 | |
| 62000 | 3 | |
| 60000 | 3 | |
| 57000 | 3 | |
| 50000 | 3 | |
| 12000 | 3 | |
| 10000 | 3 |
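Several of the `mw` statistics are functions of the others, so they can be recomputed as a consistency check. The sketch below uses only figures copied from the summary above. Treating `-1` as a placeholder for an unknown molecular weight is an assumption: the report does not say what `-1` means, only that it accounts for all 111707 negative values. All names in the code are illustrative.

```python
# Figures copied from the `mw` summary above.
N_OBSERVATIONS = 488522
MISSING = 267
MEAN = 204.23171
STD = 2298.4744
Q1, Q3 = 58.44, 295.32
MINIMUM, MAXIMUM = -1, 900000
SENTINEL_COUNT = 111707  # rows where mw == -1

# Statistics are computed over non-missing rows only.
n_valid = N_OBSERVATIONS - MISSING

derived = {
    "range": MAXIMUM - MINIMUM,
    "iqr": Q3 - Q1,
    "cv": STD / MEAN,          # coefficient of variation
    "variance": STD ** 2,
    "sum": MEAN * n_valid,
}

# ASSUMPTION (not stated in the report): -1 marks an unknown molecular weight.
# Removing those rows adds SENTINEL_COUNT back to the sum (each contributed -1)
# and shrinks the number of rows the mean is taken over.
n_real = n_valid - SENTINEL_COUNT
mean_without_sentinel = (derived["sum"] + SENTINEL_COUNT) / n_real

for name, value in derived.items():
    print(f"{name}: {value:.6g}")
print(f"mean without -1 rows: {mean_without_sentinel:.3f}")
```

The recomputed range, IQR, coefficient of variation, variance and sum match the reported values up to rounding. If the `-1` assumption holds, the reported mean of 204.23171 understates the typical molecular weight, because more than a fifth of the rows pull it toward `-1`.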
datestamp
Date
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| Minimum | 2023-05-17 00:00:00 |
|---|---|
| Maximum | 2023-08-24 00:00:00 |
source_source_id
Unsupported
MISSING  REJECTED  UNSUPPORTED 
| Missing | 488522 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 3.7 MiB |
toxval_uuid
Categorical
CONSTANT 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - | 488522 |
|---|---|
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 488522 |
|---|---|
| Distinct characters | 1 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
Common Values
| Value | Count | Frequency (%) |
| - | 488522 |
Words
| Value | Count | Frequency (%) |
| | 488522 |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 488522 |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 488522 |
Most frequent character per category
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 488522 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 488522 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| - | 488522 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 488522 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 488522 |
toxval_hash
Categorical
CONSTANT 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - | 488522 |
|---|---|
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 488522 |
|---|---|
| Distinct characters | 1 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
Common Values
| Value | Count | Frequency (%) |
| - | 488522 |
Words
| Value | Count | Frequency (%) |
| | 488522 |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 488522 |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 488522 |
Most frequent character per category
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 488522 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 488522 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| - | 488522 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 488522 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 488522 |
target_species
Categorical
HIGH CORRELATION 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| Human | 333639 |
|---|---|
| - | 154883 |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 3.7318237 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1823078 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Human |
|---|---|
| 2nd row | Human |
| 3rd row | Human |
| 4th row | Human |
| 5th row | Human |
Common Values
| Value | Count | Frequency (%) |
| Human | 333639 | |
| - | 154883 |
Words
| Value | Count | Frequency (%) |
| human | 333639 | |
| | 154883 |
Most occurring characters
| Value | Count | Frequency (%) |
| H | 333639 | |
| u | 333639 | |
| m | 333639 | |
| a | 333639 | |
| n | 333639 | |
| - | 154883 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1334556 | |
| Uppercase Letter | 333639 | 18.3% |
| Dash Punctuation | 154883 | 8.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| u | 333639 | |
| m | 333639 | |
| a | 333639 | |
| n | 333639 |
Uppercase Letter
| Value | Count | Frequency (%) |
| H | 333639 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 154883 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1668195 | |
| Common | 154883 | 8.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| H | 333639 | |
| u | 333639 | |
| m | 333639 | |
| a | 333639 | |
| n | 333639 |
Common
| Value | Count | Frequency (%) |
| - | 154883 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1823078 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| H | 333639 | |
| u | 333639 | |
| m | 333639 | |
| a | 333639 | |
| n | 333639 | |
| - | 154883 |
study_group
Text
| Distinct | 90591 |
|---|---|
| Distinct (%) | 18.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
Length
| Max length | 31 |
|---|---|
| Median length | 1 |
| Mean length | 8.1159416 |
| Min length | 1 |
Characters and Unicode
| Total characters | 3964816 |
|---|---|
| Distinct characters | 46 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 53261 |
|---|---|
| Unique (%) | 10.9% |
Sample
| 1st row | ECHA IUCLID_1172305 |
|---|---|
| 2nd row | ECHA IUCLID_dup_2 |
| 3rd row | ECHA IUCLID_1172307 |
| 4th row | ECHA IUCLID_1172308 |
| 5th row | ECHA IUCLID_1172309 |
Words
| Value | Count | Frequency (%) |
| | 299908 | |
| echa | 167663 | |
| epa | 4909 | 0.7% |
| ow | 4909 | 0.7% |
| rsl_dup_6 | 2548 | 0.4% |
| rsl_dup_5 | 1800 | 0.3% |
| rsl_dup_1 | 1125 | 0.2% |
| rsl_dup_2 | 1108 | 0.2% |
| rsl_dup_3 | 1087 | 0.2% |
| rsl_dup_4 | 947 | 0.1% |
| Other values (90588) | 181568 |
Most occurring characters
| Value | Count | Frequency (%) |
| I | 338306 | 8.5% |
| C | 336692 | 8.5% |
| _ | 323967 | 8.2% |
| - | 304702 | 7.7% |
| L | 186229 | 4.7% |
| (space) | 179050 | 4.5% |
| A | 177600 | 4.5% |
| E | 172790 | 4.4% |
| H | 168787 | 4.3% |
| D | 168455 | 4.2% |
| Other values (36) | 1608238 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 1785912 | |
| Decimal Number | 961179 | |
| Lowercase Letter | 410006 | 10.3% |
| Connector Punctuation | 323967 | 8.2% |
| Dash Punctuation | 304702 | 7.7% |
| Space Separator | 179050 | 4.5% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| I | 338306 | |
| C | 336692 | |
| L | 186229 | |
| A | 177600 | |
| E | 172790 | |
| H | 168787 | |
| D | 168455 | |
| U | 167663 | |
| R | 16070 | 0.9% |
| S | 15824 | 0.9% |
| Other values (8) | 37496 | 2.1% |
Lowercase Letter
| Value | Count | Frequency (%) |
| d | 135571 | |
| u | 135353 | |
| p | 135353 | |
| s | 677 | 0.2% |
| e | 436 | 0.1% |
| i | 436 | 0.1% |
| l | 436 | 0.1% |
| f | 218 | 0.1% |
| n | 218 | 0.1% |
| c | 218 | 0.1% |
| Other values (5) | 1090 | 0.3% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 156027 | |
| 2 | 127649 | |
| 8 | 93922 | |
| 3 | 89419 | |
| 5 | 89333 | |
| 7 | 89299 | |
| 4 | 86267 | |
| 6 | 85855 | |
| 9 | 73186 | |
| 0 | 70222 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 323967 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 304702 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 179050 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2195918 | |
| Common | 1768898 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| I | 338306 | |
| C | 336692 | |
| L | 186229 | |
| A | 177600 | |
| E | 172790 | |
| H | 168787 | |
| D | 168455 | |
| U | 167663 | |
| d | 135571 | |
| u | 135353 | |
| Other values (23) | 208472 |
Common
| Value | Count | Frequency (%) |
| _ | 323967 | |
| - | 304702 | |
| (space) | 179050 | |
| 1 | 156027 | |
| 2 | 127649 | 7.2% |
| 8 | 93922 | 5.3% |
| 3 | 89419 | 5.1% |
| 5 | 89333 | 5.1% |
| 7 | 89299 | 5.0% |
| 4 | 86267 | 4.9% |
| Other values (3) | 229263 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3964816 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| I | 338306 | 8.5% |
| C | 336692 | 8.5% |
| _ | 323967 | 8.2% |
| - | 304702 | 7.7% |
| L | 186229 | 4.7% |
| (space) | 179050 | 4.5% |
| A | 177600 | 4.5% |
| E | 172790 | 4.4% |
| H | 168787 | 4.3% |
| D | 168455 | 4.2% |
| Other values (36) | 1608238 |
human_ra
Categorical
HIGH CORRELATION  IMBALANCE 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| - | 456470 |
|---|---|
| Y | 32052 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 488522 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
Common Values
| Value | Count | Frequency (%) |
| - | 456470 | |
| Y | 32052 | 6.6% |
Words
| Value | Count | Frequency (%) |
| | 456470 | |
| y | 32052 | 6.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 456470 | |
| Y | 32052 | 6.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 456470 | |
| Uppercase Letter | 32052 | 6.6% |
Most frequent character per category
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 456470 |
Uppercase Letter
| Value | Count | Frequency (%) |
| Y | 32052 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 456470 | |
| Latin | 32052 | 6.6% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| - | 456470 |
Latin
| Value | Count | Frequency (%) |
| Y | 32052 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 488522 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 456470 | |
| Y | 32052 | 6.6% |
visible
Categorical
CONSTANT 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.7 MiB |
| 1 | 488522 |
|---|---|
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 488522 |
|---|---|
| Distinct characters | 1 |
| Distinct categories | 1 ? |
| Distinct scripts | 1 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 488522 |
Words
| Value | Count | Frequency (%) |
| 1 | 488522 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 488522 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 488522 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 488522 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 488522 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 488522 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 488522 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 488522 |
| | toxval_id | toxval_numeric | toxval_numeric_original | study_duration_value | species_id | mw | source | source_url | subsource_url | details_text | priority_id | qc_status | risk_assessment_class | human_eco | toxval_numeric_qualifier | toxval_numeric_qualifier_original | study_type | study_duration_class | study_duration_units | strain_group | habitat | sex | exposure_route | exposure_form | exposure_form_original | lifestage | lifestage_original | generation | generation_original | target_species | human_ra |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| toxval_id | 1.000 | -0.203 | -0.202 | 0.021 | 0.076 | 0.233 | 0.975 | 0.941 | 0.025 | 0.975 | 0.629 | 0.249 | 0.512 | 0.433 | 0.248 | 0.435 | 0.456 | 0.301 | 0.323 | 0.427 | 0.022 | 0.444 | 0.443 | 0.013 | 0.012 | 0.294 | 0.295 | 0.299 | 0.299 | 0.519 | 0.584 |
| toxval_numeric | -0.203 | 1.000 | 0.982 | -0.224 | 0.066 | -0.091 | 0.011 | 0.004 | 0.000 | 0.011 | 0.007 | 0.000 | 0.015 | 0.004 | 0.000 | 0.000 | 0.014 | 0.000 | 0.000 | 0.134 | 0.000 | 0.005 | 0.020 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.006 | 0.010 |
| toxval_numeric_original | -0.202 | 0.982 | 1.000 | -0.227 | 0.067 | -0.085 | 0.001 | 0.005 | 0.000 | 0.001 | 0.008 | 0.000 | 0.017 | 0.000 | 0.000 | 0.000 | 0.015 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.024 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.011 |
| study_duration_value | 0.021 | -0.224 | -0.227 | 1.000 | -0.278 | 0.131 | 0.251 | 0.194 | 0.007 | 0.251 | 0.189 | 0.038 | 0.123 | 0.220 | 0.008 | 0.011 | 0.133 | 0.100 | 0.319 | 0.075 | 0.011 | 0.120 | 0.069 | 0.000 | 0.000 | 0.099 | 0.099 | 0.113 | 0.113 | 0.021 | 0.254 |
| species_id | 0.076 | 0.066 | 0.067 | -0.278 | 1.000 | -0.149 | 0.376 | 0.373 | 0.041 | 0.376 | 0.447 | 0.138 | 0.331 | 0.202 | 0.081 | 0.120 | 0.262 | 0.079 | 0.153 | 0.207 | 0.097 | 0.159 | 0.182 | 0.047 | 0.049 | 0.063 | 0.069 | 0.058 | 0.058 | 0.202 | 0.575 |
| mw | 0.233 | -0.091 | -0.085 | 0.131 | -0.149 | 1.000 | 0.012 | 0.013 | 0.000 | 0.012 | 0.008 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 |
| source | 0.975 | 0.011 | 0.001 | 0.251 | 0.376 | 0.012 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.317 | 0.465 | 0.818 | 0.207 | 0.304 | 0.388 | 0.323 | 0.204 | 0.174 | 1.000 | 0.513 | 0.269 | 0.329 | 0.255 | 0.341 | 0.405 | 0.488 | 0.488 | 0.950 | 1.000 |
| source_url | 0.941 | 0.004 | 0.005 | 0.194 | 0.373 | 0.013 | 1.000 | 1.000 | 0.174 | 1.000 | 0.916 | 0.310 | 0.412 | 0.801 | 0.194 | 0.297 | 0.309 | 0.225 | 0.183 | 0.168 | 0.023 | 0.424 | 0.237 | 0.027 | 0.020 | 0.197 | 0.180 | 0.336 | 0.336 | 0.805 | 0.997 |
| subsource_url | 0.025 | 0.000 | 0.000 | 0.007 | 0.041 | 0.000 | 1.000 | 0.174 | 1.000 | 1.000 | 0.064 | 0.000 | 0.088 | 0.009 | 0.009 | 0.015 | 0.072 | 0.671 | 0.120 | 0.013 | 0.000 | 0.015 | 0.013 | 0.000 | 0.000 | 0.005 | 0.004 | 0.005 | 0.005 | 0.012 | 0.067 |
| details_text | 0.975 | 0.011 | 0.001 | 0.251 | 0.376 | 0.012 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.317 | 0.465 | 0.818 | 0.207 | 0.304 | 0.388 | 0.323 | 0.204 | 0.174 | 1.000 | 0.513 | 0.269 | 0.329 | 0.255 | 0.341 | 0.405 | 0.488 | 0.488 | 0.950 | 1.000 |
| priority_id | 0.629 | 0.007 | 0.008 | 0.189 | 0.447 | 0.008 | 1.000 | 0.916 | 0.064 | 1.000 | 1.000 | 0.232 | 0.578 | 0.371 | 0.150 | 0.289 | 0.466 | 0.518 | 0.406 | 0.322 | 0.008 | 0.321 | 0.400 | 0.009 | 0.009 | 0.497 | 0.497 | 0.498 | 0.498 | 0.361 | 0.810 |
| qc_status | 0.249 | 0.000 | 0.000 | 0.038 | 0.138 | 0.000 | 0.317 | 0.310 | 0.000 | 0.317 | 0.232 | 1.000 | 0.423 | 0.450 | 0.069 | 0.078 | 0.278 | 0.044 | 0.118 | 0.078 | 0.002 | 0.081 | 0.067 | 0.000 | 0.000 | 0.043 | 0.043 | 0.043 | 0.043 | 0.127 | 0.081 |
| risk_assessment_class | 0.512 | 0.015 | 0.017 | 0.123 | 0.331 | 0.000 | 0.465 | 0.412 | 0.088 | 0.465 | 0.578 | 0.423 | 1.000 | 0.540 | 0.144 | 0.153 | 0.937 | 0.131 | 0.222 | 0.173 | 0.037 | 0.337 | 0.241 | 0.007 | 0.005 | 0.255 | 0.233 | 0.348 | 0.348 | 0.486 | 0.628 |
| human_eco | 0.433 | 0.004 | 0.000 | 0.220 | 0.202 | 0.000 | 0.818 | 0.801 | 0.009 | 0.818 | 0.371 | 0.450 | 0.540 | 1.000 | 0.142 | 0.186 | 0.410 | 0.112 | 0.314 | 0.551 | 0.007 | 0.262 | 0.444 | 0.000 | 0.000 | 0.109 | 0.109 | 0.110 | 0.110 | 0.765 | 0.138 |
| toxval_numeric_qualifier | 0.248 | 0.000 | 0.000 | 0.008 | 0.081 | 0.000 | 0.207 | 0.194 | 0.009 | 0.207 | 0.150 | 0.069 | 0.144 | 0.142 | 1.000 | 1.000 | 0.135 | 0.076 | 0.099 | 0.139 | 0.007 | 0.196 | 0.139 | 0.000 | 0.000 | 0.075 | 0.075 | 0.080 | 0.080 | 0.231 | 0.150 |
| toxval_numeric_qualifier_original | 0.435 | 0.000 | 0.000 | 0.011 | 0.120 | 0.000 | 0.304 | 0.297 | 0.015 | 0.304 | 0.289 | 0.078 | 0.153 | 0.186 | 1.000 | 1.000 | 0.144 | 0.152 | 0.126 | 0.134 | 0.026 | 0.223 | 0.161 | 0.007 | 0.005 | 0.164 | 0.150 | 0.195 | 0.195 | 0.238 | 0.206 |
| study_type | 0.456 | 0.014 | 0.015 | 0.133 | 0.262 | 0.000 | 0.388 | 0.309 | 0.072 | 0.388 | 0.466 | 0.278 | 0.937 | 0.410 | 0.135 | 0.144 | 1.000 | 0.141 | 0.239 | 0.188 | 0.021 | 0.332 | 0.225 | 0.008 | 0.006 | 0.257 | 0.234 | 0.350 | 0.350 | 0.449 | 0.660 |
| study_duration_class | 0.301 | 0.000 | 0.000 | 0.100 | 0.079 | 0.000 | 0.323 | 0.225 | 0.671 | 0.323 | 0.518 | 0.044 | 0.131 | 0.112 | 0.076 | 0.152 | 0.141 | 1.000 | 0.151 | 0.110 | 0.602 | 0.294 | 0.066 | 0.250 | 0.215 | 0.338 | 0.366 | 0.413 | 0.413 | 0.255 | 0.320 |
| study_duration_units | 0.323 | 0.000 | 0.000 | 0.319 | 0.153 | 0.000 | 0.204 | 0.183 | 0.120 | 0.204 | 0.406 | 0.118 | 0.222 | 0.314 | 0.099 | 0.126 | 0.239 | 0.151 | 1.000 | 0.103 | 0.016 | 0.317 | 0.157 | 0.005 | 0.008 | 0.300 | 0.274 | 0.420 | 0.420 | 0.325 | 0.271 |
| strain_group | 0.427 | 0.134 | 0.000 | 0.075 | 0.207 | 0.000 | 0.174 | 0.168 | 0.013 | 0.174 | 0.322 | 0.078 | 0.173 | 0.551 | 0.139 | 0.134 | 0.188 | 0.110 | 0.103 | 1.000 | 0.021 | 0.419 | 0.141 | 0.007 | 0.000 | 0.207 | 0.189 | 0.249 | 0.249 | 0.598 | 0.197 |
| habitat | 0.022 | 0.000 | 0.000 | 0.011 | 0.097 | 0.000 | 1.000 | 0.023 | 0.000 | 1.000 | 0.008 | 0.002 | 0.037 | 0.007 | 0.007 | 0.026 | 0.021 | 0.602 | 0.016 | 0.021 | 1.000 | 0.015 | 0.012 | 0.984 | 0.984 | 0.408 | 0.984 | 0.024 | 0.024 | 0.009 | 0.003 |
| sex | 0.444 | 0.005 | 0.000 | 0.120 | 0.159 | 0.000 | 0.513 | 0.424 | 0.015 | 0.513 | 0.321 | 0.081 | 0.337 | 0.262 | 0.196 | 0.223 | 0.332 | 0.294 | 0.317 | 0.419 | 0.015 | 1.000 | 0.256 | 0.011 | 0.012 | 0.311 | 0.311 | 0.316 | 0.316 | 0.595 | 0.190 |
| exposure_route | 0.443 | 0.020 | 0.024 | 0.069 | 0.182 | 0.000 | 0.269 | 0.237 | 0.013 | 0.269 | 0.400 | 0.067 | 0.241 | 0.444 | 0.139 | 0.161 | 0.225 | 0.066 | 0.157 | 0.141 | 0.012 | 0.256 | 1.000 | 0.000 | 0.000 | 0.112 | 0.102 | 0.137 | 0.137 | 0.562 | 0.380 |
| exposure_form | 0.013 | 0.000 | 0.000 | 0.000 | 0.047 | 0.000 | 0.329 | 0.027 | 0.000 | 0.329 | 0.009 | 0.000 | 0.007 | 0.000 | 0.000 | 0.007 | 0.008 | 0.250 | 0.005 | 0.007 | 0.984 | 0.011 | 0.000 | 1.000 | 1.000 | 0.277 | 0.512 | 0.012 | 0.012 | 0.009 | 0.013 |
| exposure_form_original | 0.012 | 0.000 | 0.000 | 0.000 | 0.049 | 0.000 | 0.255 | 0.020 | 0.000 | 0.255 | 0.009 | 0.000 | 0.005 | 0.000 | 0.000 | 0.005 | 0.006 | 0.215 | 0.008 | 0.000 | 0.984 | 0.012 | 0.000 | 1.000 | 1.000 | 0.263 | 0.443 | 0.012 | 0.012 | 0.008 | 0.013 |
| lifestage | 0.294 | 0.000 | 0.000 | 0.099 | 0.063 | 0.000 | 0.341 | 0.197 | 0.005 | 0.341 | 0.497 | 0.043 | 0.255 | 0.109 | 0.075 | 0.164 | 0.257 | 0.338 | 0.300 | 0.207 | 0.408 | 0.311 | 0.112 | 0.277 | 0.263 | 1.000 | 1.000 | 0.574 | 0.574 | 0.247 | 0.096 |
| lifestage_original | 0.295 | 0.000 | 0.000 | 0.099 | 0.069 | 0.000 | 0.405 | 0.180 | 0.004 | 0.405 | 0.497 | 0.043 | 0.233 | 0.109 | 0.075 | 0.150 | 0.234 | 0.366 | 0.274 | 0.189 | 0.984 | 0.311 | 0.102 | 0.512 | 0.443 | 1.000 | 1.000 | 0.575 | 0.575 | 0.247 | 0.096 |
| generation | 0.299 | 0.000 | 0.000 | 0.113 | 0.058 | 0.000 | 0.488 | 0.336 | 0.005 | 0.488 | 0.498 | 0.043 | 0.348 | 0.110 | 0.080 | 0.195 | 0.350 | 0.413 | 0.420 | 0.249 | 0.024 | 0.316 | 0.137 | 0.012 | 0.012 | 0.574 | 0.575 | 1.000 | 1.000 | 0.250 | 0.097 |
| generation_original | 0.299 | 0.000 | 0.000 | 0.113 | 0.058 | 0.000 | 0.488 | 0.336 | 0.005 | 0.488 | 0.498 | 0.043 | 0.348 | 0.110 | 0.080 | 0.195 | 0.350 | 0.413 | 0.420 | 0.249 | 0.024 | 0.316 | 0.137 | 0.012 | 0.012 | 0.574 | 0.575 | 1.000 | 1.000 | 0.250 | 0.097 |
| target_species | 0.519 | 0.006 | 0.000 | 0.021 | 0.202 | 0.001 | 0.950 | 0.805 | 0.012 | 0.950 | 0.361 | 0.127 | 0.486 | 0.765 | 0.231 | 0.238 | 0.449 | 0.255 | 0.325 | 0.598 | 0.009 | 0.595 | 0.562 | 0.009 | 0.008 | 0.247 | 0.247 | 0.250 | 0.250 | 1.000 | 0.180 |
| human_ra | 0.584 | 0.010 | 0.011 | 0.254 | 0.575 | 0.000 | 1.000 | 0.997 | 0.067 | 1.000 | 0.810 | 0.081 | 0.628 | 0.138 | 0.150 | 0.206 | 0.660 | 0.320 | 0.271 | 0.197 | 0.003 | 0.190 | 0.380 | 0.013 | 0.013 | 0.096 | 0.096 | 0.097 | 0.097 | 0.180 | 1.000 |

First rows

| | toxval_id | source_hash | source_table | chemical_id | dtxsid | source | subsource | source_url | subsource_url | details_text | priority_id | qc_status | risk_assessment_class | human_eco | toxval_type | toxval_type_original | toxval_subtype | toxval_subtype_original | toxval_numeric | toxval_numeric_original | toxval_numeric_converted | toxval_numeric_standard | toxval_numeric_human | toxval_units | toxval_units_original | toxval_units_converted | toxval_units_standard | toxval_units_human | toxval_numeric_qualifier | toxval_numeric_qualifier_original | study_type | study_type_original | study_duration_class | study_duration_class_original | study_duration_value | study_duration_value_original | study_duration_units | study_duration_units_original | species_id | species_original | strain | strain_original | strain_group | habitat | sex | sex_original | critical_effect | critical_effect_original | population | population_original | exposure_route | exposure_route_original | exposure_method | exposure_method_original | exposure_form | exposure_form_original | media | media_original | lifestage | lifestage_original | generation | generation_original | year | year_original | mw | datestamp | source_source_id | toxval_uuid | toxval_hash | target_species | study_group | human_ra | visible |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1172305 | 0b0e4e6e5e435d48b4be88e3e9ecd6e4 | source_iuclid_iuclid_repeateddosetoxicityoral | ToxVal20111_5683e23c9d49ad53 | DTXSID4021557 | ECHA IUCLID | Repeated Dose Toxicity Oral | https://echa.europa.eu/information-on-chemicals/registered-substances | - | ECHA IUCLID Details | 5 | fail:toxval_units not specified | short-term | human health | NOAEL | NOAEL | - | - | 500.0 | 500.0 | NaN | NaN | NaN | - | - | - | - | - | ~ | ca. | short-term | short-term repeated dose toxicity | - | - | 14.0 | 14 | days | days | 4510 | rat | Sprague-Dawley | Sprague-Dawley | Sprague-Dawley | - | - | - | - | - | - | - | oral | oral | gavage | gavage | - | - | - | - | - | - | - | - | - | - | 173.835 | 2023-05-17 | NaN | - | - | Human | ECHA IUCLID_1172305 | - | 1 |
| 1 | 1172306 | 22cf87387c639816e5e1006735799f31 | source_iuclid_iuclid_repeateddosetoxicityoral | ToxVal20111_219c2db0693a8ca9 | NODTXSID | ECHA IUCLID | Repeated Dose Toxicity Oral | https://echa.europa.eu/information-on-chemicals/registered-substances | - | ECHA IUCLID Details | 5 | fail:dtxsid not specified | short-term | human health | NOAEL | NOAEL | - | - | 1000.0 | 1000.0 | NaN | NaN | NaN | mg/kg-day | mg/kg bw/day (nominal) | - | - | - | = | - | short-term | short-term repeated dose toxicity | - | - | 14.0 | range-finding: 14 days main study: males were dosed daily for 2 weeks prior to pairing, during the pairing period and a further 2 weeks before necropsy; a total of 6 weeks treatment prior to necropsy. females were dosed once daily for 2 weeks prior to pai | days | range-finding: 14 days main study: males were dosed daily for 2 weeks prior to pairing, during the pairing period and a further 2 weeks before necropsy; a total of 6 weeks treatment prior to necropsy. females were dosed once daily for 2 weeks prior to pai | 4510 | rat | - | - | - | - | M/F | male/female | - | other: | - | - | oral | oral | gavage | gavage | - | - | - | - | - | - | - | - | - | - | -1.000 | 2023-05-17 | NaN | - | - | Human | ECHA IUCLID_dup_2 | - | 1 |
| 2 | 1172307 | f30fe8d16153bc99dc926223e225a889 | source_iuclid_iuclid_repeateddosetoxicityoral | ToxVal20111_b74a50ce531fcc60 | DTXSID4044400 | ECHA IUCLID | Repeated Dose Toxicity Oral | https://echa.europa.eu/information-on-chemicals/registered-substances | - | ECHA IUCLID Details | 5 | fail:toxval_units not specified | short-term | human health | LOEL | LOEL | - | - | 60.0 | 60.0 | NaN | NaN | NaN | - | - | - | - | - | = | - | short-term | short-term repeated dose toxicity | - | - | 23.0 | 23 | days | days | 4913 | mouse | Hartley | Hartley | Guinea Pig | - | M | male | - | other: | - | - | oral | oral | - | unspecified | - | - | - | - | - | - | - | - | - | - | 322.375 | 2023-05-17 | NaN | - | - | Human | ECHA IUCLID_1172307 | - | 1 |
| 3 | 1172308 | 900d2a78660511f77974e15f4d1c2468 | source_iuclid_iuclid_repeateddosetoxicityoral | ToxVal20111_09f8b3377e5beb16 | DTXSID5020607 | ECHA IUCLID | Repeated Dose Toxicity Oral | https://echa.europa.eu/information-on-chemicals/registered-substances | - | ECHA IUCLID Details | 5 | pass | short-term | human health | LOAEL | LOAEL | - | - | 2000.0 | 2000.0 | NaN | NaN | NaN | mg/kg-day | mg/kg bw/day (nominal) | - | - | - | = | - | short-term | short-term repeated dose toxicity | - | - | 14.0 | 14 | days | days | 4510 | rat | - | - | - | - | M/F | male/female | - | other: | - | - | oral | oral | gavage | gavage | - | - | - | - | - | - | - | - | 2004 | 2004 | 390.564 | 2023-05-17 | NaN | - | - | Human | ECHA IUCLID_1172308 | - | 1 |
| 4 | 1172309 | 16ac7f18834d0aee5acdabce1ee15686 | source_iuclid_iuclid_repeateddosetoxicityoral | ToxVal20111_53e5726f2c6bb8ba | DTXSID90893847 | ECHA IUCLID | Repeated Dose Toxicity Oral | https://echa.europa.eu/information-on-chemicals/registered-substances | - | ECHA IUCLID Details | 5 | pass | subchronic | human health | NOAEL | NOAEL | - | - | 625.0 | 12500.0 | NaN | NaN | NaN | mg/kg-day | ppm | - | - | - | = | - | subchronic | sub-chronic toxicity | - | - | 13.0 | 13 | weeks | weeks | 4510 | rat | Fischer 344 | Fischer 344 | Fischer | - | M/F | male/female | body weight and weight gain | body weight and weight gain | - | - | oral | oral | feed | feed | - | - | - | - | - | - | - | - | - | - | 157.873 | 2023-05-17 | NaN | - | - | Human | ECHA IUCLID_1172309 | - | 1 |
| 5 | 1172310 | 677e3a6c9a6e84d0ffe282e7d21758ce | source_iuclid_iuclid_repeateddosetoxicityoral | ToxVal20111_1bff3ed13117e2c2 | NODTXSID | ECHA IUCLID | Repeated Dose Toxicity Oral | https://echa.europa.eu/information-on-chemicals/registered-substances | - | ECHA IUCLID Details | 5 | fail:dtxsid not specified | repeat dose other | human health | NOAEL | NOAEL | - | - | 250.0 | 250.0 | NaN | NaN | NaN | mg/kg-day | mg/kg bw/day (nominal) | - | - | - | > | > | repeat dose other | repeated dose toxicity | - | - | 55.0 | 40 days to 55 days | days | 40 days to 55 days | 4510 | rat | Not Specified | Wistar | Cat | - | M/F | male/female | - | other: | - | - | oral | oral | gavage | gavage | - | - | - | - | - | - | - | - | - | - | -1.000 | 2023-05-17 | NaN | - | - | Human | ECHA IUCLID_1172310 | - | 1 |
| 6 | 1172311 | b26cb9881b4538bee770b41190046635 | source_iuclid_iuclid_repeateddosetoxicityoral | ToxVal20111_bd5288fddcb1d32f | DTXSID101057506 | ECHA IUCLID | Repeated Dose Toxicity Oral | https://echa.europa.eu/information-on-chemicals/registered-substances | - | ECHA IUCLID Details | 5 | pass | subchronic | human health | NOEL | NOEL | - | - | 20.0 | 20.0 | NaN | NaN | NaN | mg/kg-day | mg/kg bw/day (nominal) | - | - | - | = | - | subchronic | sub-chronic toxicity | - | - | 90.0 | 90 | days | days | 4510 | rat | - | - | - | - | F | female | - | other: | - | - | oral | oral | gavage | gavage | - | - | - | - | - | - | - | - | 2008 | 2008 | -1.000 | 2023-05-17 | NaN | - | - | Human | ECHA IUCLID_dup_7 | - | 1 |
| 7 | 1172312 | c048e99b0b78841880210a892ac8611c | source_iuclid_iuclid_repeateddosetoxicityoral | ToxVal20111_bd5288fddcb1d32f | DTXSID101057506 | ECHA IUCLID | Repeated Dose Toxicity Oral | https://echa.europa.eu/information-on-chemicals/registered-substances | - | ECHA IUCLID Details | 5 | pass | subchronic | human health | NOAEL | NOAEL | - | - | 500.0 | 500.0 | NaN | NaN | NaN | mg/kg-day | mg/kg bw/day (nominal) | - | - | - | = | - | subchronic | sub-chronic toxicity | - | - | 90.0 | 90 | days | days | 4510 | rat | - | - | - | - | M/F | male/female | - | - | - | - | oral | oral | gavage | gavage | - | - | - | - | - | - | - | - | 2008 | 2008 | -1.000 | 2023-05-17 | NaN | - | - | Human | ECHA IUCLID_dup_7 | - | 1 |
| 8 | 1172313 | c2dcbc2691830db96be5db500a64848e | source_iuclid_iuclid_repeateddosetoxicityoral | ToxVal20111_286bc0f57aec5148 | DTXSID1029835 | ECHA IUCLID | Repeated Dose Toxicity Oral | https://echa.europa.eu/information-on-chemicals/registered-substances | - | ECHA IUCLID Details | 5 | pass | subchronic | human health | NOEL | NOEL | - | - | 100.0 | 100.0 | NaN | NaN | NaN | mg/kg-day | mg/kg bw/day (nominal) | - | - | - | = | - | subchronic | sub-chronic toxicity | - | - | 90.0 | 90 | days | days | 4510 | rat | - | - | - | - | M/F | male/female | - | other: | - | - | oral | oral | gavage | gavage | - | - | - | - | - | - | - | - | - | - | -1.000 | 2023-05-17 | NaN | - | - | Human | ECHA IUCLID_dup_8 | - | 1 |
| 9 | 1172314 | 54427e2f21d43579566d442faf2e97a1 | source_iuclid_iuclid_repeateddosetoxicityoral | ToxVal20111_16b0ddf5022c8e18 | DTXSID3026564 | ECHA IUCLID | Repeated Dose Toxicity Oral | https://echa.europa.eu/information-on-chemicals/registered-substances | - | ECHA IUCLID Details | 5 | pass | short-term | human health | NOAEL | NOAEL | - | - | 250.0 | 5000.0 | NaN | NaN | NaN | mg/kg-day | ppm | - | - | - | = | - | short-term | short-term repeated dose toxicity | - | - | 28.0 | males were exposed for 28 days, i.e. 2 weeks prior to mating, during mating, and up to termination. females were exposed for 41-48 days, i.e. during 2 weeks prior to mating, during mating, during post-coitum, and during at least 4 days of lactation. | days | males were exposed for 28 days, i.e. 2 weeks prior to mating, during mating, and up to termination. females were exposed for 41-48 days, i.e. during 2 weeks prior to mating, during mating, during post-coitum, and during at least 4 days of lactation. | 4510 | rat | - | - | - | - | M/F | male/female | - | other: | - | - | oral | oral | feed | feed | - | - | - | - | - | - | - | - | 1996 | 1996 | 402.572 | 2023-05-17 | NaN | - | - | Human | ECHA IUCLID_1172314 | - | 1 |

Last rows

| | toxval_id | source_hash | source_table | chemical_id | dtxsid | source | subsource | source_url | subsource_url | details_text | priority_id | qc_status | risk_assessment_class | human_eco | toxval_type | toxval_type_original | toxval_subtype | toxval_subtype_original | toxval_numeric | toxval_numeric_original | toxval_numeric_converted | toxval_numeric_standard | toxval_numeric_human | toxval_units | toxval_units_original | toxval_units_converted | toxval_units_standard | toxval_units_human | toxval_numeric_qualifier | toxval_numeric_qualifier_original | study_type | study_type_original | study_duration_class | study_duration_class_original | study_duration_value | study_duration_value_original | study_duration_units | study_duration_units_original | species_id | species_original | strain | strain_original | strain_group | habitat | sex | sex_original | critical_effect | critical_effect_original | population | population_original | exposure_route | exposure_route_original | exposure_method | exposure_method_original | exposure_form | exposure_form_original | media | media_original | lifestage | lifestage_original | generation | generation_original | year | year_original | mw | datestamp | source_source_id | toxval_uuid | toxval_hash | target_species | study_group | human_ra | visible |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 488512 | 4460042 | - | source_epa_ow_nrwqc_hhc | ToxVal60127_bc75901eb4253428 | - | EPA OW NRWQC-HHC | - | source_url | - | EPA OW NRWQC-HHC Details | 2 | fail:human_eco not specified | water quality standard | not specified | Human Health for the consumption of Water + Organism | Human Health for the consumption of Water + Organism | - | - | 0.000120 | 0.000120 | NaN | NaN | NaN | mg/m3 | ug/L | - | - | - | = | - | - | - | - | - | -999.0 | - | - | - | 1000000 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2015 | 2015 | -1.0 | 2023-08-24 | NaN | - | - | - | EPA OW NRWQC-HHC_dup_1 | - | 1 |
| 488513 | 4460043 | - | source_epa_ow_nrwqc_hhc | ToxVal60127_bc75901eb4253428 | - | EPA OW NRWQC-HHC | - | source_url | - | EPA OW NRWQC-HHC Details | 2 | fail:human_eco not specified | water quality standard | not specified | Human Health for the consumption of Organism Only | Human Health for the consumption of Organism Only | - | - | 0.000120 | 0.000120 | NaN | NaN | NaN | mg/m3 | ug/L | - | - | - | = | - | - | - | - | - | -999.0 | - | - | - | 1000000 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2015 | 2015 | -1.0 | 2023-08-24 | NaN | - | - | - | EPA OW NRWQC-HHC_dup_1 | - | 1 |
| 488514 | 4460044 | - | source_epa_ow_nrwqc_hhc | ToxVal60127_84d3700651ac05bf | - | EPA OW NRWQC-HHC | - | source_url | - | EPA OW NRWQC-HHC Details | 2 | fail:human_eco not specified | water quality standard | not specified | Human Health for the consumption of Water + Organism | Human Health for the consumption of Water + Organism | - | - | 0.000018 | 0.000018 | NaN | NaN | NaN | mg/m3 | ug/L | - | - | - | = | - | - | - | - | - | -999.0 | - | - | - | 1000000 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2015 | 2015 | -1.0 | 2023-08-24 | NaN | - | - | - | EPA OW NRWQC-HHC_dup_1 | - | 1 |
| 488515 | 4460045 | - | source_epa_ow_nrwqc_hhc | ToxVal60127_84d3700651ac05bf | - | EPA OW NRWQC-HHC | - | source_url | - | EPA OW NRWQC-HHC Details | 2 | fail:human_eco not specified | water quality standard | not specified | Human Health for the consumption of Organism Only | Human Health for the consumption of Organism Only | - | - | 0.000018 | 0.000018 | NaN | NaN | NaN | mg/m3 | ug/L | - | - | - | = | - | - | - | - | - | -999.0 | - | - | - | 1000000 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2015 | 2015 | -1.0 | 2023-08-24 | NaN | - | - | - | EPA OW NRWQC-HHC_dup_1 | - | 1 |
| 488516 | 4460046 | - | source_epa_ow_nrwqc_hhc | ToxVal60127_adcdffd0d862e4a5 | - | EPA OW NRWQC-HHC | - | source_url | - | EPA OW NRWQC-HHC Details | 2 | fail:human_eco not specified | water quality standard | not specified | Human Health for the consumption of Water + Organism | Human Health for the consumption of Water + Organism | - | - | 0.000030 | 0.000030 | NaN | NaN | NaN | mg/m3 | ug/L | - | - | - | = | - | - | - | - | - | -999.0 | - | - | - | 1000000 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2015 | 2015 | -1.0 | 2023-08-24 | NaN | - | - | - | EPA OW NRWQC-HHC_dup_1 | - | 1 |
| 488517 | 4460047 | - | source_epa_ow_nrwqc_hhc | ToxVal60127_adcdffd0d862e4a5 | - | EPA OW NRWQC-HHC | - | source_url | - | EPA OW NRWQC-HHC Details | 2 | fail:human_eco not specified | water quality standard | not specified | Human Health for the consumption of Organism Only | Human Health for the consumption of Organism Only | - | - | 0.000030 | 0.000030 | NaN | NaN | NaN | mg/m3 | ug/L | - | - | - | = | - | - | - | - | - | -999.0 | - | - | - | 1000000 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2015 | 2015 | -1.0 | 2023-08-24 | NaN | - | - | - | EPA OW NRWQC-HHC_dup_1 | - | 1 |
| 488518 | 4460048 | - | source_epa_ow_nrwqc_hhc | ToxVal60127_d422e78b0edbdf3d | - | EPA OW NRWQC-HHC | - | source_url | - | EPA OW NRWQC-HHC Details | 2 | fail:human_eco not specified | water quality standard | not specified | Human Health for the consumption of Water + Organism | Human Health for the consumption of Water + Organism | cancer slope lower | cancer slope lower | 0.580000 | 0.580000 | NaN | NaN | NaN | mg/m3 | ug/L | - | - | - | = | - | - | - | - | - | -999.0 | - | - | - | 1000000 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2015 | 2015 | -1.0 | 2023-08-24 | NaN | - | - | - | EPA OW NRWQC-HHC_dup_1 | - | 1 |
| 488519 | 4460049 | - | source_epa_ow_nrwqc_hhc | ToxVal60127_d422e78b0edbdf3d | - | EPA OW NRWQC-HHC | - | source_url | - | EPA OW NRWQC-HHC Details | 2 | fail:human_eco not specified | water quality standard | not specified | Human Health for the consumption of Water + Organism | Human Health for the consumption of Water + Organism | cancer slope upper | cancer slope upper | 2.100000 | 2.100000 | NaN | NaN | NaN | mg/m3 | ug/L | - | - | - | = | - | - | - | - | - | -999.0 | - | - | - | 1000000 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2015 | 2015 | -1.0 | 2023-08-24 | NaN | - | - | - | EPA OW NRWQC-HHC_dup_1 | - | 1 |
| 488520 | 4460050 | - | source_epa_ow_nrwqc_hhc | ToxVal60127_d422e78b0edbdf3d | - | EPA OW NRWQC-HHC | - | source_url | - | EPA OW NRWQC-HHC Details | 2 | fail:human_eco not specified | water quality standard | not specified | Human Health for the consumption of Organism Only | Human Health for the consumption of Organism Only | cancer slope lower | cancer slope lower | 16.000000 | 16.000000 | NaN | NaN | NaN | mg/m3 | ug/L | - | - | - | = | - | - | - | - | - | -999.0 | - | - | - | 1000000 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2015 | 2015 | -1.0 | 2023-08-24 | NaN | - | - | - | EPA OW NRWQC-HHC_dup_1 | - | 1 |
| 488521 | 4460051 | - | source_epa_ow_nrwqc_hhc | ToxVal60127_d422e78b0edbdf3d | - | EPA OW NRWQC-HHC | - | source_url | - | EPA OW NRWQC-HHC Details | 2 | fail:human_eco not specified | water quality standard | not specified | Human Health for the consumption of Organism Only | Human Health for the consumption of Organism Only | cancer slope upper | cancer slope upper | 58.000000 | 58.000000 | NaN | NaN | NaN | mg/m3 | ug/L | - | - | - | = | - | - | - | - | - | -999.0 | - | - | - | 1000000 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2015 | 2015 | -1.0 | 2023-08-24 | NaN | - | - | - | EPA OW NRWQC-HHC_dup_1 | - | 1 |